ABSTRACT


We  have  created  a  combined  statistical-dynamical  model  to  predict  the 
probability  of  tropical  cyclone  (TC)  formation  at  daily,  2.5°  horizontal  resolution  in 
the  North  Atlantic  (NA)  at  intraseasonal  lead  times.  Based  on  prior  research  and 
our  own  analyses,  we  chose  five  large  scale  environmental  factors  (LSEFs)  to 
represent  favorable  environments  for  TC  formation.  The  LSEFs  include:  850  mb 
relative  vorticity,  sea  surface  temperature,  vertical  wind  shear,  Coriolis,  and  200 
mb  divergence.  We  used  logistic  regression  to  create  a  statistical  model  that 
depicts  the  probability  for  TC  formation  based  on  these  LSEFs.  Through 
verification  of  zero  lead  hindcasts,  we  determined  that  our  regression  model 
performs  better  than  climatology.  For  example,  these  hindcasts  had  a  Brier  skill 
score  of  0.04  and  a  relative  operating  characteristic  skill  score  of  0.72.  We  then 
forced  our  regression  model  with  LSEF  fields  from  the  NCEP  Climate  Forecast 
System  to  produce  non-zero  lead  hindcasts  and  forecasts.  We  conducted  a 
series  of  case  studies  to  evaluate  and  study  the  predictive  skill  of  our  regression 
model,  with  the  results  showing  that  our  model  produces  promising  results  at 
intraseasonal  lead  times. 
I. INTRODUCTION


A.  BACKGROUND 

Hurricane  Katrina  in  2005  demonstrated  the  devastating  force  of  a  tropical 
cyclone  (TC).  The  National  Hurricane  Center  (NHC)  estimated  the  death  toll 
from Katrina at 1,833 and the property damage at $81 billion (NHC 2006). The
biggest natural disaster in U.S. history, the Galveston Hurricane of
1900, claimed over 8,000 lives (Emanuel 2005). Not only do TCs have an impact
on civilian affairs, but they have also destroyed whole militaries and with them the dreams
of  nations.  In  1565,  a  TC  destroyed  the  French  fleet  off  the  coast  of  St. 
Augustine,  Florida,  which  forced  the  French  to  surrender  Florida  to  Spain 
(Emanuel  2005). 

Looking  further  back  into  history,  Japan  might  be  under  Chinese  rule 
today  if  not  for  two  typhoons  (Emanuel  2005).  In  1274,  Kublai  Khan,  the 
grandson  of  the  infamous  Genghis  Khan,  tried  to  conquer  Japan  for  the  Mongols. 
Kublai  sent  40,000  men  on  900  ships  from  present-day  Korea  to  Japan.  Just  as 
the  ships  pulled  into  harbor,  a  typhoon  hit  the  coast  and  13,000  men  perished. 
Yet  Kublai  did  not  learn  his  lesson.  Just  seven  years  later,  Kublai  and  140,000 
men  set  sail  to  conquer  Japan;  however,  another  typhoon  struck  as  the 
Japanese  desperately  defended  the  coast.  Kublai  himself  managed  to  escape, 
but  he  left  his  men  to  die. 

In  recent  history,  Admiral  William  Halsey,  Jr.  made  the  mistake  of  letting 
not  one  but  two  typhoons  taint  his  career.  In  December  1944,  while  the 
commander  of  the  Third  Fleet,  Admiral  Halsey  decided  to  leave  his  forces  in  the 
path  of  Typhoon  Cobra  near  the  Philippines.  Though  they  had  time  to  escape 
the  path  of  Cobra,  three  destroyers  were  sunk  and  many  other  vessels  sustained 
damage  due  to  the  TC.  Also,  146  aircraft  were  lost  and  more  importantly  over 
800  seamen  lost  their  lives  due  to  Admiral  Halsey’s  poor  decision  (Melton  2007). 
One month later, Admiral Halsey again found his forces in the path of a
typhoon.  Although  no  ships  were  lost,  six  lives  were  lost  and  75  planes  were 
destroyed.  A  Navy  court  of  inquiry  convened  on  both  occasions  and  found  the 
Admiral  guilty  of  bad  judgment;  however,  he  did  not  receive  any  punishment 
(Melton  2007). 

As a modern day example, the United States Navy (USN) conducts an
annual exercise, UNITAS Gold, an 11-nation naval exercise in the North
Atlantic (NA) that includes anti-piracy and anti-drug smuggling training. One to
three months before the exercise, military planners ask: will the weather,
specifically TCs, cooperate? Right now, operational forecasts of individual TCs
are  limited  mainly  to  lead  times  of  two  to  three  days.  For  longer  leads,  the 
weather  community  generally  only  provides  TC  climatology  as  a  guide  for  TC 
formation  in  the  NA. 

B.  CLIMATE  OSCILLATIONS  AND  IMPACTS  ON  TC  FORMATION 

TC  activity  undergoes  large  climate  scale  variations,  for  example, 
interannual  variations  in  TC  formations.  In  some  years,  the  Gulf  of  Mexico  has 
produced  eight  TCs,  while  in  other  years  the  Gulf  has  not  produced  any  TCs. 
Having  an  understanding  of  these  climate  variations  can  lead  to  better  TC 
formation  forecasting  because  it  accounts  for  the  variability  of  the  large  scale,  low 
frequency  conditions  that  influence  TC  formation. 

1.  El  Nino  and  La  Nina 

As  described  by  Hilldebrand  (2001),  El  Nino  (EN)  and  La  Nina  (LN)  events 
alter  the  circulations  in  the  NA  by  influencing  the  tropical  easterly  jet  and  creating 
anomalous  extratropical  wave  trains.  Though  ENLN  have  the  biggest  influence 
in  the  western  North  Pacific  (WNP),  they  also  alter  the  wind  shear  and  steering 
flow  in  the  NA.  These  circulation  changes  lead,  on  average,  to  more,  and  more 
intense, TCs in the NA during an LN than an EN. Also, during an LN, more TCs
form in the tropical NA, while during an EN more storms form in the subtropical
NA. 
2. Madden-Julian Oscillation


As described by Madden and Julian (1994), the Madden-Julian oscillation
(MJO) is a tropical wave that has lower and upper-level anomalies that produce
enhanced convection. The convection formed by the MJO can provide enough
low level vorticity that a TC can form, given that other necessary conditions are
also favorable. Frank and Roundy (2006) showed that 25 percent of the TCs
that formed in the NA did so when the convective phase of an MJO was present.

3.  North  Atlantic  Oscillation 

The  North  Atlantic  Oscillation  (NAO)  represents  a  variation  in  atmospheric 
mass  between  southern  and  northern  dipoles  centered  near  the  Azores  and 
Iceland.  In  the  positive  NAO  phase,  there  is  above  (below)  normal  sea  level 
pressure  in  the  southern  (northern)  dipole;  the  opposite  is  true  in  the  negative 
NAO  phase.  Frank  and  Young  (2007)  show  that  a  positive  NAO  and  a  negative 
Southern Oscillation index (associated with EN) tend to lead to a drastic
decrease  in  TC  formations  in  the  NA. 

4.  Atlantic  Multi-decadal  Oscillation 

The  Atlantic  Multi-decadal  Oscillation  (AMO)  is  a  multi-decadal  variation  in 
NA  sea  surface  temperature  (SST)  and  other  atmospheric  and  oceanic  variables. 
The  warm  (cool)  phase  of  the  AMO  tends  to  coincide  with  an  increase  (decrease) 
in  NA  TC  activity  (Wikipedia  2009). 

C.  CLIMATE  PREDICTION  PROCESS 

1.  Definitions 

a.  Climatology 

TC climatology provides a time averaged description of TC activity;
for example, the frequency of tropical cyclone formation in a given region and
period. Generally, the time average is constructed from 30 or more years of data
to produce a long term mean (LTM). LTM descriptions of TC activity can be used
to estimate future TC activity, and they can provide good descriptions of that
activity for relatively long periods and large regions. Thus, TC climatologies are
commonly  used  as  a  standard  against  which  to  assess  forecasts.  However,  LTM 
climatologies  tend  to  do  poorly  for  short  periods  and  small  regions,  and  they  of 
course  do  poorly  in  describing  variations  from  LTMs  (e.g.,  interannual  variations 
associated  with  ENLN). 

b.  Smart  Climatology 

We  define  smart  climatology  as  state-of-the-science  climatology 
that  directly  supports  the  Department  of  Defense  (DoD).  Smart  climatology  takes 
advantage  of  modern  climate  data  sets,  and  modern  climate  analysis  and 
forecasting  methods,  to  better  account  for  the  full  range  of  spatial  and  temporal 
variability  in  the  climate  systems.  Smart  climatology  provides  major 
improvements  over  traditional  climatology,  which  is  limited  mainly  to  a  LTM 
perspective  based  on  observational  data  (as  opposed  to  analysis  or  reanalysis 
data).  U.S.  military  climatologies  are  almost  exclusively  traditional  climatologies. 

c.  Tropical  Cyclone 

Tropical  cyclone  (TC)  is  the  general  term  for  a  warm-core  cyclone 
that  forms  over  the  tropical  ocean  (Glickman  2000).  By  international  agreement, 
a  TC  is  further  broken  down  by  maximum  sustained  winds:  tropical  depression 
(TD)  less  than  17  m/s,  tropical  storm  (TS)  18-32  m/s,  and  hurricane  (or  typhoon 
or  cyclone)  33  m/s  or  greater  (Glickman  2000). 

d.  Intraseasonal  Forecast 

An  intraseasonal  forecast  is  a  forecast  with  a  lead  time  of 
approximately  14-70  days. 

e.  Large  Scale  Environmental  Factors  (LSEFs) 

A  large  scale  environmental  factor  (LSEF)  is  a  climate  system 
variable  that  has  significant  impacts  on  the  formation  of  TCs  (e.g.,  sea  surface 
temperature  (SST),  low  level  relative  vorticity,  vertical  wind  shear,  etc.). 

2.  Operational  Climate  Prediction  and  Long  Range  Forecasting 

Figure  1  from  Mundhenk  (2009)  outlines  the  processes  of  operational 
climate  prediction  and  long  range  forecasting.  For  this  study,  we  applied  all  but 
the final steps in this process. These steps are described more fully in Chapter II
and  explained  by  example  in  the  results  of  our  study  shown  in  Chapter  III. 


[Figure 1 box labels: Data Selection; Climate System Analysis; Forecast Method Development; Forecast; Hindcast Weighting/Selection; Verification/Evaluation; Decision Support Product Generation; Mission Impact Evaluation]

Figure 1. Outline of the process by which operational climate prediction and long
range forecasting is done. From Mundhenk (2009).


3.  Methods  of  Prediction 

One of the most complex steps in the climate prediction and long range
forecasting process (Figure 1) is step three, Forecast Method Development. This
step involves deciding which predictive method to use: in particular, a
statistical, dynamical, or combined statistical-dynamical method.


a.  Statistical 

Most operational climate prediction centers (e.g., the National Weather
Service’s (NWS) Climate Prediction Center (CPC)) use statistical methods to
forecast at intraseasonal to interannual lead times. Statistical methods are
based on analyses of past states of the climate system that provide information
on the probabilities of different states developing in the future.

b.  Dynamical 

Dynamical methods (i.e., methods involving the use of numerical
versions  of  the  dynamical  equations  for  the  atmosphere  and  ocean)  tend  to  be 
relatively  skillful  for  lead  times  out  to  about  two  weeks.  However,  the  skill  of 
dynamical  methods  tends  to  be  lower  than  that  of  statistical  methods  for  long 
lead  times.  Thus  many  dynamical  methods  are  used  for  shorter  lead  times  (e.g., 
less  than  one  month)  but  are  phased  out  in  favor  of  statistical  methods  for  longer 
lead  times.  However,  the  skill  of  dynamical  methods  at  longer  lead  times  has 
been  increasing.  One  example  of  the  application  of  dynamical  methods  for 
intraseasonal  to  seasonal  forecasting  is  the  National  Centers  for  Environmental 
Prediction  (NCEP)  Climate  Forecast  System  (CFS),  a  coupled  atmosphere- 
ocean  dynamical  model  system  used  by  CPC.  In  2005,  the  CFS  developers 
received  an  award  for  excellent  work.  The  occasion  marked  “the  first  time  in 
history  numerical  seasonal  predictions  were  on  par  with  empirical  methods”  (van 
den  Dool  2007). 


c.  Combined  Statistical-Dynamical 

Combinations  of  statistical  and  dynamical  methods  are  also  used, 
including  weighted  averaging  of  the  outputs  from  statistical  and  dynamical 
forecasts.  For  example,  some  predictions  of  the  number  of  TCs  that  will  form  in 
a  basin  during  a  TC  season  are  based  on  both  statistical  and  dynamical  outputs. 
For instance, the International Research Institute for Climate and Society (IRI)
uses  a  combined  method  to  experimentally  predict  the  number  of  TCs  that  will 
form  in  the  NA  between  August  and  October  (Camargo  and  Barnston  2009). 

For this study, we used a combined statistical-dynamical approach.
We developed a statistical model that relates the LSEFs to the probability of TC
formation. We then used intraseasonal predictions of the LSEFs from the CFS
as inputs to our statistical model. The resulting outputs from the combined
statistical-dynamical model are probabilistic intraseasonal forecasts
of  TC  formation.  We  chose  the  CFS  because  it  is  freely  available  to  the  public 
and  it  provides  forecasts  for  lead  times  of  several  seasons.  There  are 
alternatives  to  the  CFS,  such  as  those  from  the  European  Centre  for  Medium- 
range  Weather  Forecasts  (ECMWF).  However,  the  output  from  these 
alternatives  is  not  as  readily  available  and/or  as  temporally  extensive. 
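
The combined approach described above can be illustrated with a minimal sketch. The coefficients, the intercept, the function name, and the sample LSEF values below are hypothetical placeholders chosen only for illustration; they are not the fitted values or the code from this study. The sketch assumes the LSEFs have been standardized and evaluates the usual logistic form p = 1 / (1 + exp(-(b0 + sum of b_i x_i))).

```python
import numpy as np

def formation_probability(lsefs, intercept, coefs):
    """Logistic regression probability for one or more grid cells.

    lsefs: array of shape (n_cells, n_lsefs) or (n_lsefs,)
    """
    lsefs = np.asarray(lsefs, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    z = intercept + lsefs @ coefs          # linear predictor
    return 1.0 / (1.0 + np.exp(-z))        # logistic link

# Hypothetical coefficients (NOT the values fitted in this study), ordered as:
# 850 mb vorticity, SST, vertical wind shear, Coriolis, 200 mb divergence.
HYPOTHETICAL_INTERCEPT = -6.0
HYPOTHETICAL_COEFS = [0.8, 0.9, -0.7, 0.3, 0.5]

# Hypothetical standardized LSEF values for two grid cells. In a zero lead
# hindcast these would come from analyses; in a non-zero lead forecast they
# would come from the dynamical model output.
forecast_lsefs = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],    # near-average environment
    [1.5, 1.0, -1.0, 0.2, 1.2],   # more favorable environment
])

probs = formation_probability(
    forecast_lsefs, HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFS
)
```

The same function serves both the statistical and the combined step: only the source of the LSEF inputs changes.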

D.  EXISTING  PRODUCTS 

1.  Seasonal 

Dr.  William  Gray  at  Colorado  State  University  was  the  first  to  produce  a 
seasonal  TC  forecast  for  the  NA  (Camargo  2006).  These  and  other  seasonal  TC 
forecasts  predict  the  aggregate  number  of  TC  formations  in  a  given  ocean  basin 
for the overall TC season, without specifying the time or location of
the formations within the season and basin.

The IRI experimental long range TC forecasts provide probabilities of
below-normal, normal, or above-normal numbers of TC formations for August
through  October  in  the  NA  basin.  Figure  2  shows  an  example  of  an  IRI  TC 
forecast;  IRI  issues  updates  at  the  beginning  of  every  month  throughout  the  TC 
season.  Such  broad  overview  forecasts  are  useful  but  are  limited  in  the  value 
they  provide  to  military  planners  and  other  planners  because  they  lack  specificity 
as  to  when  and  where  individual  TC  formations  are  more  and  less  likely. 


[Figure 2 image: "Probability Forecasts for Number of Tropical Cyclones"; axis label: month forecast was issued.]

Figure 2. IRI probability forecast for the number of TCs in the NA during August-October 2009,
issued by IRI on 1 April 2009 (IRI 2009).

Prediction centers that currently produce seasonal forecasts
include Colorado State University, IRI, the European Centre (ECMWF), City University of Hong
Kong, NOAA, and the Institute of Meteorology of Cuba.

In  this  study,  we  have  developed  and  tested  a  statistical-dynamical 
approach  to  intraseasonal  forecasting  of  the  probabilities  of  formations  for 
specific  dates  (individual  weeks)  and  locations  (individual  2.5  x  2.5  degree 
regions). 

2.  Intraseasonal 

Intraseasonal forecasts provide information on TC activity for
intraseasonal periods (e.g., one month within a TC season). Intraseasonal
forecasts also tend to be relatively specific about the location within a basin
where TC activity is likely. This greater temporal and spatial specificity means
that intraseasonal forecasts, if they are skillful, have the potential to be much
more useful to planners than seasonal forecasts. Some of the long range
prediction centers mentioned in the seasonal section are also developing
intraseasonal forecasts.

Frank  and  Roundy  (2006)  used  the  Madden-Julian  oscillation  (MJO)  and 
other  tropical  waves  to  create  a  30-day  outlook  of  daily  probabilities  for  TC 
formation  around  the  globe.  Roundy  makes  these  experimental  forecasts  freely 
available  through  his  homepage  at  the  State  University  of  New  York  at  Albany 
(Albany  2009).  Leroy  and  Wheeler  (2009)  use  a  similar  approach  for  southern 
hemisphere  TCs,  using  five  predictors:  the  climatology  cycle,  two  associated  with 
the  MJO,  and  two  associated  with  SST.  Unfortunately,  these  forecasts  are  not 
freely  available  to  the  public. 

Though  these  statistical  methods  show  great  promise,  Camargo  (2006) 
states  that  “while  there  is  much  room  for  improvement  in  the  skill  and  application 
of  empirical/statistical  methods  of  intra-seasonal  TC  prediction,  the  greatest  hope 
for  improvement  lies  with  dynamical/numerical  models.”  The  ECMWF  produces 
such  a  dynamical  forecast  for  TC  formation  via  their  Ensemble  Prediction  System 
(EPS)  for  the  seven  ocean  basins;  however,  it  is  not  freely  available  to  the  public. 

The  CPC  issues  operational  intraseasonal  TC  forecasts  that  are  available 
free online. Figure 3 depicts an example of a two-week TC formation outlook that is
part  of  the  CPC  Global  Tropics  Benefits/Hazards  Assessment.  This  product  has 
a  graphical  depiction  of  the  hazards  in  the  tropics  and  text  that  explains  the 
hazards.  The  text  that  explains  the  red  highlighted  area  labeled  “2”  in  the  middle 
of  the  NA  states  (CPC  2007): 

The  potential  for  tropical  cyclone  development  across  the  deep 
tropical  Atlantic  Ocean.  It  is  anticipated  that  a  northward  displaced 
and  enhanced  African  Easterly  Jet  will  continue  to  aid  in  the 
development  of  Robust  African  easterly  waves  and  with  areas  of 
above  average  SSTs  and  weak  vertical  wind  shear  the  chances  of 
tropical  development  are  increased.  Confidence:  Moderate. 

Figure 3. CPC Global Tropics Benefits/Hazards Assessment for 21-27 August
2007, issued by CPC/NCEP on 13 August 2007 (CPC 2007).

This  CPC  product  is  generated  using  a  subjective  blending  of  forecast 
tools,  but  CPC  plans  to  make  this  product  more  objective  (Gottschalck  et  al. 
2008).  In  this  study,  we  have  explored  one  method  for  making  such  predictions 
more  objective  and  have  discussed  with  CPC  the  potential  for  applying  this 
method  to  improve  the  CPC  Global  Tropics  Benefits/Hazards  Assessment. 

3.  DoD  Products 

The  DoD  currently  does  not  produce  any  seasonal  or  intraseasonal  TC 
formation  forecast  products  for  any  ocean  basin.  The  Joint  Typhoon  Warning 
Center  (JTWC)  and  the  14th  Weather  Squadron  (14WS)  provide  very  limited 
climatology  products,  but  no  explicit  seasonal  or  intraseasonal  TC  forecasts  are 
available. 

E.  RESEARCH  MOTIVATION  AND  SCOPE 

1.  Closely  Related  Prior  Work 

Meyer  (2006)  took  a  statistical  approach  to  studying  TC  formation  regions 
in  the  WNP.  Though  his  research  only  provided  zero  lead  hindcasts  using  the 
National  Centers  for  Environmental  Prediction  (NCEP)  reanalysis  data,  Meyer 
showed that if the LSEFs can be predicted at long leads, we should be able to
use  logistic  regression  to  calculate  long  range  forecasts  of  the  probabilities  for 
TC  formation. 

Mundhenk  (2009)  followed  Meyer’s  work  and  took  a  statistical-dynamical 
approach  to  predicting  TC  formation  regions  in  the  WNP.  Mundhenk  produced  a 
regression model that was trained on NCEP reanalysis data. He then
forced  that  model  with  operational  Climate  Forecast  System  (CFS)  fields  and 
produced  non-zero  lead  TC  formation  probabilities  for  the  WNP.  With  limited 
verification,  these  non-zero  lead  forecasts  appeared  to  provide  skill  and  value 
beyond  that  of  standard  climatology. 

2.  Research  Questions 

This  thesis  will  follow  the  path  of  Mundhenk  (2009)  and  address  the  same 
questions  for  the  NA  that  he  answered  for  the  WNP: 

1) Can favorable regions for TC formation be predicted at intraseasonal
lead times in the NA using the CFS?

2)  Do  these  predictions  have  more  skill  than  standard  climatology? 

3.  Thesis  Outline 

Chapter  II  discusses  the  region  we  chose  to  investigate,  the  timeframe  we 
used,  and  the  data  sets  and  methods  used  in  creating  our  regression  model. 
The  various  LSEFs  that  were  investigated  are  discussed  as  well.  This  chapter 
also  summarizes  the  various  methods  we  used  to  verify  our  hindcasts  and 
forecasts. 

Chapter  III  provides  details  of  the  regression  model  and  the  verification  of 
our  model.  We  then  show  examples  comparing  our  TC  formation  forecasts  to 
those of standard climatology. Finally, we provide examples of non-zero lead
forecasts  from  the  CFS. 

In Chapter IV, we provide a summary of our work, our conclusions, and
our  recommendations  for  future  endeavors  in  intraseasonal  TC  formation 
forecasting. 
II. DATA AND METHODS


A.  STUDY  PERIOD  AND  REGION 

We  chose  the  NA  as  our  study  region  because  of  the  importance  of  NA 
TCs  for  the  United  States  and  military  operations  in  the  region.  Figure  4  depicts 
the  NA  region  we  investigated  during  this  study,  15°  W-100°  W  and  7.5°  N  to 
37.5°  N.  The  years  we  focused  on  are  1970-2007.  From  the  National  Oceanic 
and  Atmospheric  Administration  (NOAA)  best  track  data  (NOAA  2007),  there  are 
approximately 11 TC formations in the NA per year. Note from Figure 4 that our
NA region excludes only one TC that formed during 1970-2007. Later in this
chapter,  we  cover  in  more  detail  why  we  excluded  0°-7.5°  N  from  our  data  set. 


Figure 4. The NA study region (outlined by the black box) and TC formation
points during 1970-2007 (red crosses), constructed from NOAA best track
data.

B. DATA SOURCES


1.  NOAA  Best  Track 

NOAA’s  Atlantic  Oceanographic  and  Meteorological  Laboratory  maintains 
a TC best track data set called HURDAT (NOAA 2007). Figure 4 shows the
initial locations of NA TCs during 1970-2007 as identified by HURDAT. Note the
large number of formations in the main development region for NA TCs, the tropical
Atlantic at 10°N-20°N between northern South America and western North
Africa. Unlike the JTWC TC best track archive, HURDAT only archives
storms that became a TS or greater. Therefore, if a TC only reached TD
intensity, it would not be captured in this data set. Further, HURDAT only
provides TC information starting from the time and location at which the TC
reached TD intensity. This is a major difference between Mundhenk (2009) and
our research in the NA. The JTWC best track data used by Mundhenk (2009)
trace individual TCs back to the time and location at which the initial convection
could be identified. The JTWC data also include storms that only reached TD
intensity. Thus, for example, the JTWC data have formation points with 2 m/s as
the maximum sustained winds, while in the NA, the initial information for the TCs
goes no lower than 13 m/s maximum sustained winds.

Figure  5  shows  the  occurrence  by  month  of  NA  TCs  during  1970-2007, 
with  a  peak  during  the  months  of  July-October  (JASO).  For  our  model 
development,  we  extended  beyond  this  peak  period  a  little  and  used  HURDAT 
from  June-November.  We  chose  to  do  so  to  include  as  many  TCs  as  possible 
in our model development, while also limiting the development to the period with
the  greatest  TC  activity.  We  wanted  to  provide  as  many  storms  as  possible  to 
the  logistic  regression  model  to  ensure  optimal  model  performance. 

Figure 5. Number of NA TC formations versus month of the year, constructed
with NOAA best track data from years 1970-2007.

2.  NCEP  Reanalysis 

We chose the NCEP/Department of Energy Atmospheric Model
Intercomparison Project-II reanalysis (hereafter referred to as R2) as the data set
on which to train our regression model (Kanamitsu et al. 2002). The R2 data set
covers  1979-present  at  a  T62L28  resolution.  The  R2  data  set  we  used  has  a 
2.5°  x  2.5°  horizontal  resolution  and  daily  temporal  resolution. 

3.  NOAA  OISST 

We  used  the  National  Oceanic  and  Atmospheric  Administration  (NOAA) 
optimum interpolation (OI) SST analysis version 2 (Reynolds et al. 2002). NOAA
uses both in situ and satellite observations to produce an analysis with 1° spatial resolution and
weekly temporal resolution covering 1982-present. To make the R2 and SST
data sets match, we regridded the 1° spatial resolution SST data set to 2.5°
and interpolated the weekly temporal resolution SST to daily.
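
The spatial and temporal matching of the SST data to the R2 grid can be sketched as below. The source does not state which interpolation scheme was used, so this sketch assumes simple linear interpolation in time and bilinear interpolation in space; the function names, grid definitions, and sample values are hypothetical. Area averaging would be another reasonable choice when going from a finer to a coarser grid.

```python
import numpy as np

def weekly_to_daily(week_days, weekly_values, daily_days):
    """Linearly interpolate a weekly time series onto daily time stamps."""
    return np.interp(daily_days, week_days, weekly_values)

def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinear regridding of a 2-D (lat, lon) field.

    All coordinate arrays must be in ascending order, and the destination
    grid must lie within the source grid.
    """
    # Interpolate along longitude for every source latitude row.
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Then interpolate along latitude for every destination longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T

# Hypothetical weekly SST values (deg C) at one grid point, days 0, 7, and 14.
week_days = np.array([0.0, 7.0, 14.0])
weekly_sst = np.array([26.0, 27.0, 29.0])
daily_sst = weekly_to_daily(week_days, weekly_sst, np.arange(0.0, 15.0))

# Hypothetical 1 degree source grid and the 2.5 degree study grid.
src_lat = np.arange(0.5, 45.5, 1.0)
src_lon = np.arange(-104.5, -10.0, 1.0)
dst_lat = np.arange(7.5, 37.6, 2.5)
dst_lon = np.arange(-100.0, -14.9, 2.5)

# Synthetic SST field that varies linearly with latitude and longitude.
sst_1deg = 20.0 + 0.1 * src_lat[:, None] + 0.05 * src_lon[None, :]
sst_25deg = regrid_bilinear(sst_1deg, src_lat, src_lon, dst_lat, dst_lon)
```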

4.  NCEP  CFS 

We  used  the  CFS  as  the  source  of  long  range  forecasts  of  the  LSEFs  to 
force  our  regression  model  when  generating  non-zero  lead  forecasts.  The  CFS 
is  a  fully  coupled  ocean-land-atmosphere  dynamical  model  used  to  forecast  at 
seasonal  lead  times  (Saha  et  al.  2006).  The  CFS  became  operational  at  NCEP 
in  late  2004  and  provides  four  daily  forecasts  that  have  nine-month  lead  times 
(Saha  2008). 

The atmospheric component of the CFS is a coarser resolution version of the
Global Forecast System (GFS), so the CFS has approximately 1.8° resolution.
We regridded the CFS to 2.5° spatial resolution to match our R2 resolution. The
ocean component of CFS uses the Geophysical Fluid Dynamics Laboratory
(GFDL) Modular Ocean Model version 3 (MOM3). Four CFS runs are executed
daily, with integrations out to nine months. We used all four CFS runs to create
ensembles to best capture the LSEFs. The CFS runs, like those of other dynamical long
range forecast models, tend to converge toward climatology. To remedy this
situation, NCEP provides bias correction files that we employed to remove this
systematic error.

The CFS does not provide the full suite of variables offered by R2; for example,
it does not provide vorticity and divergence. The CFS variables that we used directly are SST and
winds at 850 and 200 mb. We used these CFS winds to calculate
vorticity and divergence at 850 and 200 mb, as done by Mundhenk (2009).
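
As a rough illustration of how vorticity and divergence can be derived from gridded winds, the sketch below applies finite differences on a latitude-longitude grid using the standard spherical forms, vorticity = [dv/dlon - d(u cos(lat))/dlat] / (a cos(lat)) and divergence = [du/dlon + d(v cos(lat))/dlat] / (a cos(lat)). The numerical scheme, function name, and test wind field are assumptions made for this example; the source does not describe the scheme actually used.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # mean Earth radius

def vorticity_divergence(u, v, lat_deg, lon_deg):
    """Relative vorticity and divergence (1/s) from u, v (m/s) on a
    (lat, lon) grid, using finite differences in spherical coordinates."""
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    cos_lat = np.cos(lat)[:, None]

    du_dlon = np.gradient(u, lon, axis=1)
    dv_dlon = np.gradient(v, lon, axis=1)
    ducos_dlat = np.gradient(u * cos_lat, lat, axis=0)
    dvcos_dlat = np.gradient(v * cos_lat, lat, axis=0)

    vort = (dv_dlon - ducos_dlat) / (EARTH_RADIUS_M * cos_lat)
    div = (du_dlon + dvcos_dlat) / (EARTH_RADIUS_M * cos_lat)
    return vort, div

# Synthetic check field: solid-body rotation, u = U cos(lat), v = 0, whose
# analytic vorticity is 2 U sin(lat) / a and whose divergence is zero.
U = 10.0
lat_deg = np.arange(7.5, 37.6, 2.5)
lon_deg = np.arange(-100.0, -14.9, 2.5)
u = U * np.cos(np.deg2rad(lat_deg))[:, None] * np.ones((1, lon_deg.size))
v = np.zeros_like(u)

vort, div = vorticity_divergence(u, v, lat_deg, lon_deg)
```

Interior points use centered differences; the rows and columns on the grid boundary use one-sided differences and are therefore less accurate.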

C.  VARIABLES  OF  INTEREST 

1.  Classical  LSEFs 

Gray (1968, 1975, 1979) identified six large-scale environmental
conditions or factors needed for TC development that he called genesis
parameters. There are many lists of LSEFs, but all lists are slight variations of
Gray’s original. In addition to the presence of favorable LSEFs, TC formation
probably also requires, in general, one of several possible triggering factors,
such as a pre-existing disturbance (Gray
1968; Emanuel 1989; Zehr 1992).

a.  Sea  Surface  Temperatures 

Meteorologists have known of the relationship between high SSTs
and TC formation for about 60 years. Palmen (1948) found that TCs rarely form
when SSTs are below 26.5 °C. Our research agrees with his findings, with only
21 out of 273 NA TCs that formed during 1982-2006 having done so with SSTs
below 26.5 °C.


[Figure 6 image: box plots of SST; horizontal axis: TC Occurrence (0, 1).]

Figure 6. Box plots of SST (in °C) grouped by whether a TC formed in a given
day-grid (“0” indicates no TC formed, and “1” indicates a TC formed). The
blue box encloses 50% of the SST data points, and the whiskers (black
dashed lines) encompass ~99% of the data points. The red “+” symbols highlight
the ~1% of the data points that are outliers. Produced using NOAA OISST
data and TC occurrences from HURDAT for January-December of
1982-2006.

Figure 6 depicts the box plots that separate grid point values of
SST  on  individual  days  when  a  TC  did  not  form  (“0”,  on  the  left)  and  on  days 
when  a  TC  did  form  (“1”,  on  the  right).  Less  than  8  percent  of  the  TCs  that 
formed  in  the  NA  from  1982  to  2006  formed  with  SSTs  below  26.5  °C.  Figure  6 
also  indicates  that  the  SSTs  associated  with  TC  formations  are  relatively  distinct 
from  those  associated  with  non-formation  times  and  locations. 


b.  Humidity 


The  importance  of  mid-level  humidity  in  TC  formation  has  been  well 
documented.  Recently,  Dunkerton  et  al.  (2009)  presented  the  marsupial 
metaphor  to  describe  a  critical  layer  gyre  that  contains  mid-level  moisture  that 
favors  deep  convection  and  TC  formation. 


[Figure 7 image: scatter plot; axis label: Normalized Precipitable Water.]

Figure 7. Normalized scatter plot of North Atlantic January-December
precipitable water vs. 500 mb relative humidity. Note the strong
relationship between the two variables. Constructed from R2 data from
1982-2007 for dates and locations at which TCs occurred.

The CFS does not provide relative or specific humidity as output
variables,  so  we  chose  to  use  precipitable  water  as  a  replacement  for  humidity. 
Figure  7  shows  that  there  is  a  strong  positive  correlation  (0.72)  between  R2  500 
mb  relative  humidity  and  precipitable  water  at  the  times  and  locations  at  which 
NA  TCs  formed.  Thus,  we  concluded  that  precipitable  water  is  an  adequate 
substitute  for  mid-level  relative  humidity. 


Figure 8. Box plots of precipitable water (kg/m^2) grouped by whether a TC
formed in a given day-grid (“0” indicates no TC formed, and “1” indicates a
TC formed). The blue box encloses 50% of the precipitable water data
points, and the whiskers (black dashed lines) encompass ~99% of the data
points. The red “+” symbols highlight the ~1% of the data points that are outliers.
Produced using R2 data and TC occurrences from the HURDAT for
January-December of 1982-2006.

Figure 8 depicts the box plots for precipitable water, with “0”
indicating the non-TC information (NTCI) and “1” indicating the TC formation data.
Figure 8 also indicates that the precipitable water values associated with TC
formations are relatively distinct from those associated with non-formation times
and locations. However, the corresponding NTCI and TC box plots for 500 mb
relative humidity (not shown) are nearly identical. This indicates that precipitable
water may be a better variable than relative humidity for our logistic regression
model.


c.  Wind  Shear 

Many past studies have shown that the greater the vertical wind
shear, the less likely TC formation is (Gray 1968; Emanuel 1989; Zehr
1992). As in most prior studies, we defined vertical wind shear as the 200 mb
vector wind minus the 850 mb vector wind. This difference has both a magnitude and
a direction; however, in this study we used only the magnitude of the vertical wind
shear.
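
The shear definition above can be illustrated with a minimal sketch. The function and argument names are illustrative assumptions, not code from this study; the inputs are the zonal (u) and meridional (v) wind components in m/s at the two levels.

```python
import numpy as np

def shear_magnitude(u200, v200, u850, v850):
    """Magnitude (m/s) of the 200 mb minus 850 mb vector wind difference."""
    du = np.asarray(u200, dtype=float) - np.asarray(u850, dtype=float)
    dv = np.asarray(v200, dtype=float) - np.asarray(v850, dtype=float)
    # Only the magnitude is kept; the direction of the shear vector is discarded.
    return np.hypot(du, dv)
```

The function accepts scalars or gridded arrays of equal shape, so it can be applied to a whole 2.5° field at once.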


Figure 9. Box plots of 200-850 mb wind shear (in m/s) grouped by whether a TC
formed in a given day-grid (“0” indicates no TC formed, and “1” indicates a
TC formed). The blue box encloses 50% of the wind shear data points, and
the whiskers (black dashed lines) encompass ~99% of the data points.
The red “+” symbols highlight the ~1% of the data points that are outliers. Produced
using R2 data and TC occurrences from the HURDAT for
January-December of 1982-2006.

Figure 9 clearly highlights that low wind shear is associated with TC
formation  in  the  NA,  while  generally  higher  wind  shear  is  present  when  there  are 
no  TC  formations.  Seventy-five  percent  of  the  TCs  form  with  shear  below  12 
m/s  and  99  percent  form  with  a  wind  shear  less  than  22  m/s. 

d.  Upward  Vertical  Motion 

Like 500 mb relative humidity, 200 mb omega is not available in the
CFS package, so we decided to use 200 mb divergence as an alternative to
represent upward vertical motion. Since 200 mb divergence is not available
through CFS either, we calculated it via second order centered finite differencing
of the 200 mb wind. The correlation between R2 200 mb divergence and 500 mb
omega is -0.77 (see Figure 10). The negative correlation indicates that when
500 mb omega is negative (indicating upward motion), 200 mb divergence is
positive (indicating upper level divergence consistent with mid-level upward
motion), and vice versa. Figure 11 depicts the box plot for 200 mb divergence.
Of the 303 TCs that formed from 1982 to 2007, 91 percent did so with
upper-level divergence. Thus, upper-level divergence appears to be a good candidate for
our logistic model.
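
The centered finite differencing mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the code used in this study: it assumes a regular latitude-longitude grid, the spherical form of horizontal divergence, and a mean Earth radius of 6.371e6 m. The function name and arguments are hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # assumed mean Earth radius

def divergence_200mb(u, v, lat_deg, lon_deg):
    """Horizontal divergence (1/s) on a regular latitude-longitude grid.

    u, v: 2-D arrays with shape (lat, lon), zonal and meridional wind in m/s.
    lat_deg, lon_deg: 1-D coordinate arrays in degrees.
    np.gradient uses second-order centered differences at interior points
    and one-sided differences on the grid edges.
    """
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    cos_lat = np.cos(lat)[:, np.newaxis]
    du_dlon = np.gradient(u, lon, axis=1)
    dvcos_dlat = np.gradient(v * cos_lat, lat, axis=0)
    return (du_dlon + dvcos_dlat) / (EARTH_RADIUS_M * cos_lat)
```

Edge rows and columns are less accurate than the interior because of the one-sided differences, so a practical application might discard them or pad the domain.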

Figure 10. Normalized scatter plot of January-December 200 mb divergence vs.
500 mb omega. Note the strong negative correlation between the two
variables. Constructed from R2 data from 1982-2007 for times and
locations at which TCs occurred.

Figure 11. Box plots of 200 mb divergence (in 1/s) grouped by whether a TC
formed in a given day-grid. The blue box encloses 50% of the divergence
data points, and the whiskers (black dashed lines) encompass ~99% of the
data points. The red “+” symbols highlight the ~1% of the data points that are outliers.
Produced using R2 data and TC occurrences from the HURDAT for
January-December of 1982-2006.

e.  Low-Level  Vorticity 

The  occurrence  of  lower-level  positive  relative  vorticity  is  important 
for  TC  formation.  The  box  plots  in  Figure  12  indicate  that  only  seven  NA  TCs 
during  1982-2006  formed  with  negative  relative  vorticity  at  850  mb.  Thus,  we 
selected  low-level  relative  vorticity  as  a  potential  LSEF  for  our  logistic  regression 
model. 


Figure 12. Box plots of 850 mb relative vorticity (in 1/s) grouped by whether a TC
formed in a given day-grid. The blue box encloses 50% of the vorticity
data points, and the whiskers (black dashed lines) encompass ~99% of the
data points. The red “+” symbols highlight the ~1% of the data points that are outliers.
Produced using R2 data and TC occurrences from the HURDAT for
January-December of 1982-2006.

We  also  investigated  using  planetary  vorticity  and  absolute  vorticity 
as  LSEFs.  The  occurrence  of  only  one  TC  formation  south  of  7.5°  N  indicates 
that  planetary  vorticity  has  a  positive  relationship  with  TC  formation.  Thus,  we 
included  a  Coriolis  term  in  the  model  to  account  for  this  dependence  on  planetary 
vorticity  (see  Chapter  III).  We  chose  not  to  use  absolute  vorticity  as  an  LSEF, 
because  the  model  performed  better  when  using  separate  relative  vorticity  and 
Coriolis  terms  than  when  using  absolute  vorticity  alone. 


2.  Non-Classical  LSEFs 

To reduce the risk of excluding important LSEFs, we investigated a
number of additional variables available in R2 for use as LSEFs in the regression
model, including:

1) Mean sea level pressure

2-5) 850 mb: relative humidity, divergence, omega, and temperature

6-9) 500 mb: relative vorticity, absolute vorticity, divergence, and temperature

10-15) 200 mb: relative humidity, relative vorticity, absolute vorticity,
divergence, omega, and temperature

16-18) Thickness: 200-850 mb, 500-700 mb, and 500-850 mb

Based on verification and other analyses of the outputs from test models
that used the full range of LSEFs (see Section D, Chapter II), we rejected
variables 1-18 (above) for use in our logistic regression model. The final set of
LSEFs that we used (see Section D, Chapter II) is very similar to the set
specified by Gray (1968, 1975, 1979).

D.  PROBABILISTIC  EQUATION  DEVELOPMENT 

1.  Logistic  Regression 

We  used  logistic  regression  to  create  our  TC  formation  probability  forecast 
using  the  LSEFs  as  our  independent  variables.  For  exact  details  on  logistic 
regression  the  reader  is  referred  to  Wilks  (2006)  or  a  similar  college  level 
statistics  book.  It  is  important  to  note  that  our  LSEFs  are  not  completely 
independent  of  each  other.  For  example,  a  region  of  positive  low-level  vorticity 
and  high  SSTs  will  also  have  high  relative  humidity.  Therefore,  positive  low-level 
vorticity  and  high  relative  humidity  are  positively  correlated.  Ideally,  this  sort  of 
correlation  between  LSEFs  would  not  exist,  and  its  existence  makes  model 
building,  and  interpretation  of  the  results,  more  challenging. 

2.  Model  Training 

We  wanted  to  use  the  best  data  available  to  train  our  regression  model,  so 
we  chose  data  only  from  the  satellite  era  (approximately  1970-present).  As 
stated  before,  we  chose  R2  and  OISST  data  to  train  our  model;  therefore,  our 
training data set was limited to 1982-2008. Ultimately, we chose to include only
LSEF  data  from  the  peak  formation  period,  June  through  November,  to  minimize 
data  dilution  (Eckel  2008). 

As  discovered  by  Mundhenk  (2009)  in  the  WNP,  our  model  has  a 
tendency  to  underpredict.  Underprediction  occurs  when  the  forecast  probability 
is  below  the  observed  frequency.  Thus,  our  model  tends  to  forecast  lower  TC 
formation  probabilities  than  the  actual  observed  TC  formation  probabilities.  The 
reason  for  this  shortcoming  of  the  model  is  that  the  days  and  locations 
immediately  surrounding  the  time  and  place  at  which  a  TC  forms  tend  to  have 
LSEF  values  that  are  favorable  for  formation.  However,  TCs  tend  to  form  in 
relative  temporal  and  spatial  isolation  from  each  other  (with  a  small  number  of 
exceptions).  Thus,  the  model  assigns  a  lower  probability  of  formation  to  those 
conditions  based  on  the  lack  of  formation  during  the  surrounding  times  and 
locations.  The  net  result  is  that  the  regression  model  tends  to  predict  lower 
formation  probabilities  than  observed. 

To  remedy  this  underprediction,  we  chose  to  exclude  60  percent  of  the 
non-TC  information  (NTCI)  from  the  R2  and  HURDAT  data  sets.  When  100 
percent  of  the  NTCI  were  used,  the  model  vastly  under-predicted  TC  formation 
probabilities.  We  also  decided  not  to  use  data  below  7.5°  N  because  there  has 
only  been  one  NA  TC  since  1970  that  formed  south  of  7.5°  N  (see  Figure  4). 
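
The NTCI reduction described above can be sketched as a subsampling step. The text does not state how the excluded blocks were chosen, so random selection is an assumption here, as are the function name, the seed, and the input layout (one 0/1 flag per day-grid block).

```python
import numpy as np

def subsample_ntci(formed, keep_fraction=0.4, seed=0):
    """Return sorted row indices that keep every TC formation block and a
    random `keep_fraction` of the non-TC (NTCI) blocks."""
    formed = np.asarray(formed, dtype=bool)
    rng = np.random.default_rng(seed)
    tc_rows = np.flatnonzero(formed)
    ntci_rows = np.flatnonzero(~formed)
    n_keep = int(round(keep_fraction * ntci_rows.size))
    # Sampling without replacement so that no NTCI block is counted twice.
    kept_ntci = rng.choice(ntci_rows, size=n_keep, replace=False)
    return np.sort(np.concatenate([tc_rows, kept_ntci]))
```

Keeping all formation blocks while thinning the non-formation blocks raises the event frequency seen by the regression, which is the intended effect of excluding 60 percent of the NTCI.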

3.  Model  Selection 

We  ran  our  regression  model  numerous  times  with  different  LSEF 
variables,  amounts  of  NTCI,  and  start  and  end  dates.  First,  we  ran  each  LSEF 
separately  to  find  which  LSEFs  had  the  strongest  relationships  with  TC 
formation.  We  then  ran  all  possible  combinations  of  these  LSEFs  to  ensure  we 
had  the  best  combination  of  variables  possible  based  on  the  results  of  various 
scoring  methods.  Second,  we  tried  40,  50,  60,  80,  and  100  percent  NTCI  to 
address  the  under-prediction  problem  mentioned  above.  Third,  we  ran  with 
various  start  and  end  dates  for  the  LSEF  and  TC  data  to  ensure  we  were  not 
using too much data from months in which TC formation is rare. This was a
delicate balance because if we chose just July-October (JASO), we would have excluded too
many  TC  formation  points,  which  would  have  reduced  the  skill  of  our  regression 
model.  We  settled  on  using  data  from  June-November  to  develop  the  regression 
model. 

As statistical forecasting is relatively new, especially for an event as rare as
a  TC,  finding  a  suitable  set  of  methods  to  score  the  model  results  was  not  an 
easy  task.  As  discussed  by  Mundhenk  (2009),  we  selected  the  following  as  the 
tools  with  which  to  test  our  model. 

a. Akaike Information Criterion

Akaike information criterion (AIC) is a measure of the goodness-of-fit
of a statistical model with a penalty for added independent variables to counter
tendencies toward overfitting. Low values of AIC indicate the preferred model;
that  is,  the  model  with  the  fewest  independent  variables  that  still  provides  a  good 
fit  to  the  data.  We  direct  the  reader  to  Burnham  and  Anderson  (2002)  for  a  more 
detailed  explanation  of  AIC. 

b.  Deviance 

We  used  residual  deviance  to  compare  models  developed  from 
different  LSEFs,  NTCI,  and  start  and  end  dates.  Deviance  determines  how  well 
the  given  equation  accounts  for  variability:  the  lower  the  residual  deviance  of  a 
model,  the  better  the  goodness-of-fit  of  that  model. 
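
As a minimal sketch of the two goodness-of-fit measures above, the following computes residual deviance and AIC from fitted probabilities. It uses the standard definitions for a 0/1 response (deviance equal to -2 times the log-likelihood, and AIC = 2k - 2 ln L); the function names are illustrative assumptions, and this is not the software used in this study.

```python
import numpy as np

def log_likelihood(y, p, eps=1e-15):
    """Bernoulli log-likelihood of 0/1 outcomes y given fitted probabilities p."""
    y = np.asarray(y, dtype=float)
    # Clip to avoid log(0) when a fitted probability is exactly 0 or 1.
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def residual_deviance(y, p):
    """Residual deviance; for 0/1 data this equals -2 times the log-likelihood."""
    return -2.0 * log_likelihood(y, p)

def aic(y, p, n_params):
    """AIC = 2k - 2 ln(L), where k counts the intercept and all coefficients."""
    return 2.0 * n_params + residual_deviance(y, p)
```

Both measures are only comparable between models fitted to the same set of day-grid blocks, so runs with different NTCI fractions or date ranges need care when compared this way.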

c.  Stability 

To ensure a stable model, we used the jackknife approach to
create our model. We ran the regression model for 1982-2006, first excluding
the year 1982, second excluding the year 1983, and so on. We compared each
of these logistic regression equations to ensure stable coefficients (indicated by
coefficient changes that are small from one run to the next). We then took, for
each LSEF, the average of all the coefficients from the different runs, to come up
with the coefficients for our final logistic regression model equation.
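
The leave-one-year-out procedure above can be sketched as follows. The logistic fit is a plain Newton-Raphson iteration written for illustration; it is an assumption that stands in for whatever statistical package was actually used, and all names are hypothetical. Predictors with very different magnitudes would likely need to be standardized before using this simple solver.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(y)), X])
    b = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ b))
        w = p * (1.0 - p)
        grad = A.T @ (y - p)
        hess = A.T @ (A * w[:, None])
        b = b + np.linalg.solve(hess, grad)
    return b

def jackknife_by_year(X, y, years):
    """Refit with each year left out; return per-run and mean coefficients."""
    X, y, years = np.asarray(X), np.asarray(y), np.asarray(years)
    runs = {}
    for yr in np.unique(years):
        keep = years != yr
        runs[int(yr)] = fit_logistic(X[keep], y[keep])
    mean_coef = np.mean(list(runs.values()), axis=0)
    return runs, mean_coef
```

Comparing the entries of `runs` shows how much each coefficient changes when a single year is withheld, and `mean_coef` corresponds to the averaged coefficients described in the text.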

d.  Physical  Plausibility 

One  might  expect  that  a  regression  model  would  return  an  equation 
that  is  physically  plausible,  but  this  was  not  always  the  case.  For  example,  we 
know  that  positive  low-level  relative  vorticity  is  positively  correlated  with  TC 
formation,  so  a  model  that  returned  results  with  a  negative  coefficient  for  low- 
level  relative  vorticity  would  be  identified  as  physically  implausible.  This  sort  of 
physically  implausible  coefficient  and  implied  relationship  between  an  LSEF  and 
TC  formation  probability  arose  in  some  cases  when  we  included  LSEFs  that  are 
closely  correlated  with  each  other.  For  example,  when  we  included  variables 
representing  low-level  relative  vorticity,  SST,  vertical  velocity,  and  relative 
humidity,  the  regression  process  assigned  a  negative  coefficient  to  relative 
humidity. This was because of the positive correlations between relative
humidity and the other LSEFs in the same model (SST, low-level relative
vorticity, vertical velocity). As stated above, the LSEFs are not independent;
therefore,  we  chose  to  strike  a  delicate  balance  between  goodness-of-fit  and 
model  complexity,  as  we  felt  physical  plausibility  was  an  important,  desirable 
model  attribute. 

4.  Model  Verification 

Prior  to  our  research,  no  one  had  published,  to  our  knowledge,  a  complete 
set  of  methods  for  verifying  probabilistic  forecasts  of  individual  TC  formations. 
Thus,  we  developed  a  set  of  several  verification  metrics  applied  in  unison  to 
verify our model. The metrics we used are: hits, misses, Brier score (BS), Brier
skill score (BSS), relative operating characteristic (ROC), and reliability.

We also created anomaly probabilities from both our hindcast and
forecast probabilities. The anomaly probability is our forecast probability
minus the corresponding climatological probability. The resulting forecast
probability anomaly provides a clear depiction of how our forecasts differ from
normal probabilities, which can be very useful to planners who are familiar with
normal probabilities and risks.

E.  SUMMARY  OF  PREDICTION  METHOD 

Figure  13  (adapted  from  Mundhenk  2009)  illustrates  the  process  we  used 
in  this  study.  As  Mundhenk  (2009)  explains,  “this  process  is  a  combined 
statistical-dynamical  one,  wherein  one  uses  a  numerical  model  to  force  a 
statistical  model  to  generate  ensemble  based,  probabilistic,  intraseasonal 
predictions  of  TC  formations.” 


Figure  13.  Depiction  of  the  process  for  generating  intraseasonal  predictions  of 
tropical  cyclogenesis  (adapted  from  Mundhenk  2009). 

III. RESULTS


A.  REGRESSION  MODEL 

We employed logistic regression to construct an equation for the
probability of TC formation in a given 2.5° x 2.5° block on a given day. Logistic
regression finds best estimates of the intercept $b_0$ and the coefficients $b_k$ for each
LSEF ($x_k$). This leads to the probability of TC formation ($p_F$) at a given day-grid
point:

$$p_F = \frac{e^{(b_0 + b_1 x_1 + \dots + b_6 x_6)}}{1 + e^{(b_0 + b_1 x_1 + \dots + b_6 x_6)}}$$

This equation is the regression model developed and tested in this study.
Table  1  lists  the  LSEFs  and  their  coefficients  of  the  optimal  version  of  this 
equation.  The  optimal  model  was  selected  from  all  the  models  we  developed 
and  tested  by  goodness  of  fit  and  the  other  considerations  described  in  Sections 
III.A.1  and  III.A.2.  The  LSEFs  in  the  optimal  version  of  the  regression  equation 
are,  in  order  of  their  significance  as  determined  by  a  Chi-squared  test:  850  mb 
relative  vorticity,  850  mb  relative  vorticity  squared  (hereafter,  RV2),  SST,  wind 
shear,  200  mb  divergence,  and  a  term  representing  Coriolis  effects.  We  tested 
numerous  other  models,  but  this  model  provided  the  best  fit  between  our  LSEF 
data  and  TC  formation  data.  The  negative  sign  for  the  RV2  and  wind  shear 
coefficients  indicates  that  when  wind  shear  and  RV2  increase,  the  probability  of 
TC  formation  goes  down.  Conversely,  the  positive  coefficients  for  850  mb 
relative  vorticity,  SST,  200  mb  divergence,  and  the  Coriolis  term  indicate  that 
when  these  variables  increase,  the  probability  of  TC  formation  goes  up. 
Therefore, the model summarized in Table 1 is physically plausible in terms of the
coefficient  signs. 

Variable                   Regression Coefficient   Significance Rank   Standard Error   t Value
-    (Intercept)           b0   -21.4805624         -                   1.441591         -15.10
x1   850 mb Rel. Vort      b1   239718.544          1                   14316.48         16.81
x2   850 mb Rel. Vort^2    b2   -2897314840         2                   257575600        -11.25
x3   SST                   b3   0.459047508         3                   0.04858943       9.64
x4   Wind Shear            b4   -0.09942719         4                   0.01095125       -8.99
x5   200 mb Divergence     b5   54503.2704          5                   8348.18          6.38
x6   Coriolis              b6   10576.12604         6                   3493.039         3.12

Table 1. LSEF coefficients and related descriptive statistics for the optimal
regression model developed and tested in this study. The model was
developed using NA LSEF and TC data from June-November of
1982-2006, and included data from 40 percent of the NTCI time-location blocks.
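
To illustrate how the coefficients in Table 1 enter the logistic equation, the sketch below evaluates the probability for one day-grid block. The units are assumptions (vorticity and divergence in 1/s, SST in °C, shear in m/s), as is the form of the Coriolis term, taken here to be f = 2Ω sin(latitude). The function name and the sample inputs are hypothetical.

```python
import math

# Coefficients copied from Table 1 (b0 through b6).
B0 = -21.4805624
COEF = {
    "rel_vort_850": 239718.544,
    "rel_vort_850_sq": -2897314840.0,
    "sst": 0.459047508,
    "wind_shear": -0.09942719,
    "div_200": 54503.2704,
    "coriolis": 10576.12604,
}

def formation_probability(rel_vort_850, sst, wind_shear, div_200, lat_deg):
    """Daily TC formation probability for one day-grid block.

    Assumed units (not stated in Table 1): vorticity and divergence in 1/s,
    SST in deg C, shear in m/s, Coriolis term f = 2*Omega*sin(lat) in 1/s.
    """
    omega = 7.292e-5  # Earth's rotation rate, rad/s
    f = 2.0 * omega * math.sin(math.radians(lat_deg))
    z = (B0
         + COEF["rel_vort_850"] * rel_vort_850
         + COEF["rel_vort_850_sq"] * rel_vort_850 ** 2
         + COEF["sst"] * sst
         + COEF["wind_shear"] * wind_shear
         + COEF["div_200"] * div_200
         + COEF["coriolis"] * f)
    # 1 / (1 + exp(-z)) is algebraically the same as exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical favorable block at 15 N.
p_example = formation_probability(2e-5, 28.5, 8.0, 5e-6, 15.0)
```

Under these assumed units, the example block yields a daily probability below 5 percent, and raising the shear or lowering the SST reduces it, in line with the coefficient signs discussed above.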

For  comparison,  Table  2,  from  Mundhenk  (2009),  summarizes  the 
regression  model  he  developed  and  tested  for  long  range  forecasting  of  TC 
formations in the WNP. The similarities between the two models are striking:
both use the same LSEF variables, and the significance rankings are identical
except that the Coriolis and 200 mb divergence terms swap fifth and sixth place.
Gray (1968, 1975, 1979) implied that the genesis parameters
applied  in  all  the  tropical  basins,  and  the  NA  and  WNP  models  seem  to  confirm 
his  assertion.  As  noted  in  Section  II.B.1,  the  data  sets  for  the  WNP  and  the  NA 
are  vastly  different.  The  WNP  had  formation  points  when  there  was  just 
convection  present,  while  the  NA  data  set  had  formation  points  only  at  TD 
strength.  These  differences  might  have  led  one  to  expect  different  regression 
models for the NA and WNP. Despite the similarities, there are some
notable differences; for example, the differences in the magnitude of the RV2
term. The RV2 term is included in the model to reduce storm chasing, which is a
tendency for the model to provide high probabilities at the time and location of a
TC but well after the TC has formed. Unlike the WNP, the NA had a severe
problem  with  storm  chasing,  which  explains  the  difference  in  the  RV2  magnitudes 
for  the  NA  and  WNP.  Also,  the  Coriolis  term  is  more  important  in  the  WNP  than 
it  is  in  the  NA.  This  is  partially  due  to  the  tendency  for  more  low  latitude 
formations  in  the  WNP  than  in  the  NA  (during  1970-2006,  only  one  TC  formed 
below  7.5°  N  in  the  NA,  while  many  storms  form  below  that  latitude  in  the  WNP). 

Variable                      Regression Coefficient   Significance Rank   Standard Error   t Value
-    (Intercept)              b0   -27.41179           -                   1.81639          -15.09
x1   850 mb Rel. Vorticity    b1   167645.1            1                   7074.82          23.69
x2   850 mb Rel. Vorticity^2  b2   -1679802094.0       2                   112033900        -14.99
x3   SST                      b3   0.6567593           3                   0.06061          10.83
x4   Vertical Wind Shear      b4   -0.05990173         4                   0.00687          -8.71
x5   Coriolis Parameter       b5   15861.34            5                   2646.58          5.99
x6   200 mb Divergence        b6   24729.49            6                   6152.83          4.01

Table 2. LSEF coefficients and related descriptive statistics for the optimal
regression model developed and tested for WNP TCs by Mundhenk
(2009). The model was developed using WNP LSEF and TC data from
June-November of 1982-2006, and included data from 40 percent of the
NTCI time-location blocks.

As  in  the  WNP,  we  had  to  include  two  adjustments  to  our  model  to  focus 
the  model  on  formation  days  rather  than  on  days  in  which  mature  TCs  occurred 
(Mundhenk  2009).  This  is  important  because  the  LSEFs  present  on  the  day  of 
formation  are  very  similar  to  those  after  the  day  of  formation.  To  remedy  this  we 
first  included  a  mean  sea  level  pressure  (MSLP)  filter  to  eliminate  NTCI  for  which 
MSLP  was  below  998  mb.  Of  the  273  TCs  that  formed  during  1982-2006,  the 
lowest pressure at formation was 999 mb. Second, we included the RV2 term to
reduce the tendency toward storm chasing. After the formation of a TC, the 850 mb
relative vorticity increases as the 850 mb winds increase; by adding the RV2
term, we tried to account for the negative impact an established TC has on the
likelihood of formation of another TC.

We  also  chose  to  use  40  percent  of  the  NTCI  when  developing  our  logistic 
regression.  When  we  used  100  percent  NTCI,  we  had  indications  that  the  model 
was  drastically  under-predicting.  As  mentioned  above,  this  under-prediction  is 
due  to  the  LSEFs  being  similar  around  the  TC  formation  day.  Using  40  percent 
NTCI,  we  excluded  some  of  the  grid  points  associated  with  already  formed,  or 
about  to  form  TCs.  As  a  result,  the  LSEFs  present  on  formation  day  are 
associated  in  the  regression  model  with  a  higher  probability  of  formation  than 
when 100 percent of the NTCI is used. This greatly reduced our under-prediction
problem  and  improved  the  model’s  reliability,  as  shown  in  Section  B,  Chapter  III. 

Our model faced two challenges because it was trained on data with daily
resolution. The first is that the training data include only 273 TCs that formed during
1982-2006 out of 722,850 day-grid blocks in the training period. Thus, the
probability of a TC forming in any given day-grid block during the training
period is about 0.00038 (273/722,850). Therefore, the daily probability for TC
formation is extremely low. As in
the WNP, these daily probabilities seldom exceed 5 percent, even in the most
favorable regions (Mundhenk 2009). The model-predicted daily formation
probabilities are often much higher than these overall average probabilities, but
they are still low from the perspective of many forecast users (for example, much
less than 50 percent). This raises the challenge of how to present long-range
forecasts in which the formation probabilities are generally well below the
probabilities at which mission planners are accustomed to revising their plans.

The  second  challenge  associated  with  training  on  daily  data  is  that  skillful 
long  lead  forecasts  of  conditions  on  an  individual  day  are  very  difficult  to  produce. 
To have skill at long lead times, long-range forecasts are generally time averages
that are valid over a number of consecutive days (e.g., a whole season,
month,  or  week).  This  time  averaging  reduces  the  temporal  resolution,  but 
increases  the  skill,  of  the  forecast.  This  is  in  part  because  a  time  averaged 
forecast  reduces  the  impacts  of  timing  errors  by  expanding  the  temporal  size  of 
the  forecast  target  (e.g.,  to  score  a  hit,  a  TC  needs  to  form  on  just  one  day  during 
the  seven  day  period,  rather  than  on  just  the  one  day  on  which  the  TC  occurred). 

To  remedy  these  problems,  we  investigated  summing  the  daily 
probabilities  over  three-,  five-,  and  seven-day  periods.  The  summed  probabilities 
for  all  of  these  cases  did  a  decent  job  depicting  TC  formation.  We  ended  up 
using  seven-day  summed  probability  forecasts  because  they  provided  the 
greatest  skill  while  still  using  a  relatively  short  valid  period.  Figure  14  is  an 
example  of  a  zero  lead  seven-day  summed  probability  hindcast  using  R2  and 
OISST LSEFs. The days summed for this forecast are 2 September through 8
September  2002  with  the  forecast  centered  on  5  September  2002. 
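
The summing of daily probabilities over a centered window can be sketched as below. This is an illustration, not the code used in this study; it assumes time is the first array axis and that the window length is odd, and the function name is hypothetical.

```python
import numpy as np

def summed_probability(daily_prob, window=7):
    """Centered running sum of daily probabilities along the time axis (axis 0).

    Days without a full window (the first and last window // 2 days) are NaN.
    Assumes an odd window length.
    """
    daily_prob = np.asarray(daily_prob, dtype=float)
    n = daily_prob.shape[0]
    half = window // 2
    out = np.full(daily_prob.shape, np.nan)
    if n < window:
        return out
    # Prepend a zero row so that csum[i] is the sum of the first i days.
    zero = np.zeros((1,) + daily_prob.shape[1:])
    csum = np.concatenate([zero, np.cumsum(daily_prob, axis=0)], axis=0)
    out[half:n - half] = csum[window:] - csum[:n - window + 1]
    return out
```

For a forecast centered on 5 September, the value returned for that day is the sum of the daily probabilities for 2 through 8 September, matching the example described above.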

The  seven-day  summed  forecasts  allow  forecasters  and  mission  planners 
to  work  with  probabilities  that:  (1)  are  large  enough  to  be  easily  interpreted;  and 
(2)  span  a  short  enough  period  to  be  operationally  useful  for  long  range  planning 
(e.g.,  for  planning  a  one  to  two  week  transit  of  a  battle  group). 

Figure 14. Example of seven-day summed probability from a zero lead hindcast
for  2-8  September  2002.  The  forecast  is  centered  on  the  248th  day  (5 
September)  of  2002.  The  black  dot  represents  TC  Fay  that  formed  on  5 
September  2002.  Contours  start  at  1  percent  and  are  in  1  percent 
increments.  This  hindcast  was  generated  using  the  regression  model 
described  at  the  beginning  of  Section  A,  Chapter  III,  and  using  40  percent 
of  the  NTCI. 

B.  VERIFICATION  OF  THE  REGRESSION  MODEL 

As  mentioned  above,  there  is  no  standard  set  of  verification  methods  for 
events  as  rare  as  TC  formations,  so  we  tested  and  applied  several  methods  to 
verify  our  model. 

1. Quantitative Verification


We  first  used  quantitative  verification  techniques  to  evaluate  our 
regression  model.  The  reader  should  consult  Wilks  (2006)  for  additional  details 
of  the  verification  techniques  used  below.  We  conducted  hindcasts  for  all  the 
years  1982-2006  to  test  our  model  and  generate  a  large  number  of  forecasts  for 
verification.  We  used  jackknifing  to  create  our  model  (see  Section  3.c  of  Chapter 
II);  therefore  our  zero  lead  forecasts  are  independent  of  the  data  used  to  create 
the  forecasts. 

All  of  the  verification  techniques  we  used  are  based  on  dichotomous 
observation  values,  such  that  a  grid  point  has  a  value  of  “one”  if  a  TC  occurred  or 
“zero”  if  no  TC  occurred.  We  credited  a  forecast  with  a  hit  at  a  model  grid  point, 
or  a  “one”,  if  a  TC  occurred  within  a  2.5°  radius  of  the  grid  point.  This  2.5°  radius 
was  used  to  account  for  uncertainties  in  HURDAT  TC  positions,  and  to  account 
for  the  spatial  scale  of  the  LSEFs  and  TCs  early  in  their  life  cycle. 

We used the Brier skill score (BSS) to measure the accuracy of our TC
probability forecasts. Based on zero lead hindcasts for the peak season,
June-November, our regression model has a BSS of 0.039589. A 95 percent BSS
confidence interval (0.037888 to 0.041508) was created via jackknifing through
each of the years in the data set. Although this BSS is only slightly above
zero, it shows that our model has greater skill than climatology.
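
A minimal sketch of the Brier score and Brier skill score follows, using the standard definitions with climatology (or any other reference forecast) supplied as an array of probabilities. The function names are illustrative assumptions.

```python
import numpy as np

def brier_score(forecast_prob, observed):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    f = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean((f - o) ** 2))

def brier_skill_score(forecast_prob, observed, reference_prob):
    """BSS = 1 - BS / BS_ref; positive values beat the reference forecast."""
    bs = brier_score(forecast_prob, observed)
    bs_ref = brier_score(reference_prob, observed)
    return 1.0 - bs / bs_ref
```

A BSS of zero means the forecast is no better than the reference, and a value of one would require perfect forecasts, so small positive values are expected for very rare events.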

We also looked at reliability diagrams (e.g., Figures 15-16) to determine
the bias in our regression model. If our model were unbiased, the forecast
probability would match the observed frequency. As expected, most of our
forecast probabilities are in the 0 to 0.005 bin. The dashed line in Figure 16
depicts a perfectly reliable model, and points above the diagonal no-skill line
represent positive skill. Our model slightly under-predicts at probabilities below
10 percent, as the results lie slightly above the dashed perfect reliability line. From these
diagrams, we get a reliability of 0.00003, resolution of 0.0002, and uncertainty of
0.0049. Overall, as with BSS, the reliability diagrams show that our regression
model exhibits skill beyond climatology.
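
The reliability, resolution, and uncertainty terms quoted above come from the standard binned decomposition of the Brier score (BS = reliability - resolution + uncertainty). The sketch below is an illustration with hypothetical names and a user-supplied set of bin edges; it is not the code used in this study.

```python
import numpy as np

def brier_decomposition(forecast_prob, observed, bin_edges):
    """Return (reliability, resolution, uncertainty) using binned forecasts."""
    f = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=float)
    n = f.size
    base_rate = o.mean()
    # Interior edges only, so every forecast falls in exactly one bin.
    which = np.digitize(f, bin_edges[1:-1])
    reliability = 0.0
    resolution = 0.0
    for k in np.unique(which):
        in_bin = which == k
        n_k = in_bin.sum()
        f_k = f[in_bin].mean()
        o_k = o[in_bin].mean()
        reliability += n_k * (f_k - o_k) ** 2
        resolution += n_k * (o_k - base_rate) ** 2
    uncertainty = base_rate * (1.0 - base_rate)
    return reliability / n, resolution / n, uncertainty
```

With sample climatology as the reference, the skill score can be written as (resolution - reliability) / uncertainty. Using the rounded values quoted above gives (0.0002 - 0.00003) / 0.0049, or about 0.035, which is consistent with the reported BSS of roughly 0.04 given the rounding of the three terms.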


Figure 15. Reliability diagram (left) and the bin histogram (right) for the zero lead
hindcasts for June-November of 1982-2006. Created with minimum bin
intervals at 0.005. The error bars on the reliability diagram represent a 95
percent confidence interval.


Figure 16. Reliability diagram (left) for probabilities of 40 percent or less for the
zero lead hindcasts for June-November of 1982-2006.

A more familiar way to verify a forecast is looking at the hit rate and the
false  alarm  rate.  The  relative  operating  characteristic  (ROC)  is  a  graphical  tool  to 
analyze  both  rates  simultaneously.  Figure  17  depicts  a  ROC  for  our  zero  lead 
hindcasts.  The  dashed  line  in  red  represents  a  model  with  zero  resolution,  while 
the  red  circle  at  (0,1)  represents  a  model  with  perfect  resolution.  As  one  can 
see,  our  model  shows  good  resolution  and  offers  value  to  the  user.  We  also 
calculated  the  ROC  skill  score  (ROCSS),  which  has  a  value  of  one  for  a  perfect 
forecast  and  less  than  zero  for  a  forecast  worse  than  climatology.  The  ROCSS 
for  our  hindcasts  is  0.72026,  which  again  shows  that  our  regression  model  does 
better  than  climatology. 
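
The ROC points and skill score can be sketched as follows. The sketch assumes the common definition ROCSS = 2A - 1, where A is the area under the ROC curve, and uses the trapezoidal rule for the area; the thresholds and function names are illustrative assumptions rather than the settings used in this study.

```python
import numpy as np

def roc_points(forecast_prob, observed, thresholds):
    """Hit rate and false alarm rate for each probability threshold."""
    f = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=bool)
    hit, false_alarm = [], []
    for t in thresholds:
        warn = f >= t
        hit.append(np.sum(warn & o) / np.sum(o))
        false_alarm.append(np.sum(warn & ~o) / np.sum(~o))
    return np.array(hit), np.array(false_alarm)

def roc_skill_score(hit, false_alarm):
    """ROCSS = 2 * (area under the ROC curve) - 1, via the trapezoidal rule."""
    far = np.concatenate([[0.0], false_alarm, [1.0]])
    hr = np.concatenate([[0.0], hit, [1.0]])
    # Sort by false alarm rate, breaking ties by hit rate.
    order = np.lexsort((hr, far))
    far, hr = far[order], hr[order]
    area = np.sum(0.5 * (hr[1:] + hr[:-1]) * np.diff(far))
    return 2.0 * area - 1.0
```

A forecast whose points lie on the diagonal scores zero, and a forecast that reaches the point (0,1) scores one, matching the interpretation of the ROC diagram given above.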


Figure  17.  ROC  diagram  for  the  zero  lead  hindcasts  for  June-November  of  1982- 
2006.  The  dashed  red  line  represents  zero  resolution.  The  red  circle  at 
(0,1)  represents  perfect  resolution. 

2. Qualitative Verification


We used qualitative verification to compare high formation probabilities
generated by the model with actual TC activity. We obtained independent data
either by withholding a year during model development (jackknifing) or by using
data from after the 1982-2006 period used to develop the model. Figure 18 depicts
a zero lead independent hindcast for 13 August 2000, developed from a 40 percent
NTCI model that jackknifed the year 2000. The model shows 7 to 8 percent
probabilities at the TC formation location, indicating a model hit for this TC.
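
The jackknifing step described above can be sketched as a leave-one-year-out split. The Python example below is only an illustration under the assumption that each training sample carries a year label; the function name and array layout are hypothetical and do not reproduce the code used in this study.

```python
import numpy as np


def jackknife_splits(years):
    """Yield (held_out_year, train_indices, test_indices) for each year.

    years: one year label per sample (e.g., per time-location block).
    The model would be fit on train_indices and evaluated on test_indices,
    so the held-out year is independent of model development.
    """
    years = np.asarray(years)
    for held_out in np.unique(years):
        is_test = years == held_out
        yield int(held_out), np.flatnonzero(~is_test), np.flatnonzero(is_test)


# Example: hold out the year 2000 from a toy set of sample labels.
sample_years = [1999, 1999, 2000, 2000, 2001]
for year, train_idx, test_idx in jackknife_splits(sample_years):
    if year == 2000:
        print(train_idx.tolist(), test_idx.tolist())
```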

Figure 18. Zero lead seven-day summed TC formation probability hindcast
centered  on  the  226th  day  (13  August)  of  2000,  constructed  from  a  40 
percent  NTCI  model  that  jackknifed  year  2000.  The  black  dot  represents 
TC  Beryl  that  formed  on  13  August  2000.  Contours  start  at  1  percent  and 
are  in  1  percent  increments. 

Figure 19 depicts a zero lead hindcast from R2 and OISST fields for 5
November 2008. The years 2007 and 2008 are not included in our regression
model because (1) the R2 data for some of those years were not available at the
beginning of our research, and (2) we wanted to retain an independent data set
for testing and verification. The zero lead hindcast for 5 November 2008
captures TC Paloma at 15 percent, indicating a model hit for this TC.

Figure 19. Zero lead seven-day summed TC formation probability hindcast
centered on the 309th day (5 November) of 2008, constructed from a 40
percent NTCI model. The black dot represents TC Paloma that formed on
5 November 2008. Contours start at 1 percent and are in 1 percent
increments.

3.  Comparisons  to  Climatology 

To  verify  our  hindcasts,  we  also  compared  model  probabilities  to  those 
based  on  climatology.  Prior  to  this  study,  no  probabilistic  TC  formation 
climatology  was  available  for  the  NA.  Figure  20  depicts  the  daily  probability  that 
a  TC  will  form  at  a  given  grid  point;  note  that  the  highest  value  is  0.0666  percent. 
In  scoring  hits  and  misses  for  our  forecasts  with  respect  to  climatology,  we  gave 
the  model  forecasts  a  hit  (miss)  if  the  model’s  daily  forecast  probability  was 
greater  (less)  than  the  climatological  probability.  Using  this  method,  the  model 
scored  273  hits  and  16  misses,  for  a  hit  rate  of  94.5  percent. 
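
The hit and miss scoring described above can be written as a short Python sketch. It assumes the comparison is made once per scored case, between the model's daily probability and the climatological probability for that case; how ties were treated is not stated in the text, so here they count as neither. The function names are illustrative only.

```python
def count_hits_misses(model_probs, clim_probs):
    """Count cases where the model probability beats (hit) or trails (miss) climatology."""
    hits = sum(1 for m, c in zip(model_probs, clim_probs) if m > c)
    misses = sum(1 for m, c in zip(model_probs, clim_probs) if m < c)
    return hits, misses  # ties (m == c) are counted as neither; an assumption


def hit_rate_percent(hits, misses):
    """Hit rate as a percentage of all scored cases."""
    return 100.0 * hits / (hits + misses)


# Totals quoted in the text: 273 hits and 16 misses.
print(round(hit_rate_percent(273, 16), 1))  # 94.5
```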

Figure 20. Contoured daily climatology probability of TC formation, constructed
from  HURDAT  from  the  years  1970-2007.  Values  represent  the  daily 
probability,  averaged  over  the  entire  year,  that  a  TC  will  form  in  a  given 
grid  point. 

A  weakness  in  the  raw  spatial  climatology  shown  in  Figure  20  is  that  it  has 
the  same  values  for  1  June  as  1  October,  which  is  not  consistent  with  actual  TC 
activity.  We  would  expect  the  probability  for  TC  formation  to  be  higher  on 
1  October  than  on  1  June. 

As  described  by  Mundhenk  (2009),  we  created  a  more  robust  form  of 
climatology  for  the  NA  that  varies  spatially  and  temporally.  Figure  21  displays 
the  daily  probabilities  for  1  June  and  1  October.  As  expected  with  our  robust 
climatology,  the  daily  probabilities  for  1  October  are  higher  than  the  probabilities 
for  1  June.  We  used  this  spatially  and  temporally  varying  climatology  for  the 
anomaly  forecasts  below. 

Figure 21. Daily climatology probability of TC formation on 1 June (top) and 1
October (bottom), constructed from HURDAT from the years 1970-2007.
Values represent the daily probability that a TC will form in a given grid
point.

Figure 22 is similar to Figure 19 but shows the corresponding forecast
anomaly, which is the hindcast probability minus the corresponding robust daily
climatology described above. In the anomaly forecast, positive (negative) values
indicate regions of above (below) average formation probabilities. Positive and
negative anomaly probabilities have potential value in operational planning, since
they identify areas where elevated risks might preclude operations, and areas
where suppressed risks might provide opportunities for operations that do not
normally exist.
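
The anomaly calculation can be sketched in Python as follows. The example assumes that a seven-day summed probability is the plain sum of the daily probabilities over the center day and the three days on either side, and that the hindcast and climatology are stored as arrays of shape (days, latitude, longitude); both are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def seven_day_sum(daily, center):
    """Sum daily probability fields over the seven days centered on `center`.

    daily: array of shape (n_days, n_lat, n_lon); center: index of the center day.
    """
    if center < 3 or center + 3 >= daily.shape[0]:
        raise ValueError("need three days on each side of the center day")
    return daily[center - 3:center + 4].sum(axis=0)


def formation_anomaly(hindcast_daily, climatology_daily, center):
    """Seven-day summed hindcast probability minus seven-day summed climatology."""
    return seven_day_sum(hindcast_daily, center) - seven_day_sum(climatology_daily, center)


# Toy fields: positive values mark above-average formation probability.
hindcast = np.full((9, 2, 2), 0.02)
climatology = np.full((9, 2, 2), 0.005)
anomaly = formation_anomaly(hindcast, climatology, center=4)
```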


Figure  22.  Zero  lead  seven-day  summed  TC  formation  probability  hindcast 
anomaly  calculated  as  seven-day  summed  hindcast  probability  minus 
seven-day  summed  robust  daily  climatology  probability.  Centered  on  the 
309th  day  (5  November)  of  2008  and  constructed  from  a  40  percent  NTCI 
model.  The  magenta  dot  represents  TC  Paloma  that  formed  on  5 
November  2008. 


4.  Verification  Against  Deep  Convection 

Our model appears to do quite well at predicting tropical convection as
well as TC formation, similar to the results of Mundhenk (2009) for the WNP.
We chose to verify our model against outgoing longwave radiation (OLR), since
low OLR values highlight deep convection, which in turn indicates areas that are
favorable for TC formation. Figure 23 shows an example of this verification for
the six-week lead forecast probabilities for 13 November 2008 compared to the
analyzed OLR on the same day. For this non-zero lead forecast, we used
forecasts of the LSEFs from CFS to force the regression model (as discussed in
Chapter II). Note that off the coast of Panama, there was a region of high
probabilities, which coincided with an area of low OLR (cool colors). In this case,
no TC formed in the Panama region, but the occurrence of deep convection
there indicates that the model correctly identified deep
convection conditions that tend to be favorable for TC formation (Gray 1968,
1975, 1979). These and similar results for other cases indicate that our model
has potential for forecasting tropical convection at intraseasonal lead times (six
weeks in this example).

[Figure 23 image: map axes spanning EQ to 50N and 100W to 20W; mean OLR in W/m^2, MIN = 129.5, MAX = 300.75; GrADS image.]

Figure 23. Comparison of six-week lead forecast probabilities (top) and OLR
(bottom) for 13 November 2008. OLR image provided by Physical
Sciences  Division  (PSD),  Earth  System  Research  Laboratory,  NOAA, 
Boulder,  Colorado  (PSD  2008). 

C. FINDINGS FROM CFS CASE STUDIES

We conducted six non-zero lead hindcast studies from our archived CFS
data. All six cases showed promising results, and the results from two of the
cases are summarized in this section.

1.  Non-Zero  Lead  Hindcasts:  Paloma 

TC Paloma formed off the east coast of Nicaragua on 5 November 2008 and
went on to become the second-strongest November hurricane on
record in the NA (NHC 2009). Figure 24 depicts the zero lead hindcast using R2
LSEFs (panel a) and the one through seven week lead hindcast probabilities
created via our model when forced with CFS forecasts of the LSEFs (panels b-h).

Figure 24 shows that the zero lead hindcast has higher probabilities than
the one to seven week lead hindcasts. This result holds for all of our case
studies. It is one of several indications that, compared to the R2 LSEFs, the CFS
LSEFs have trouble capturing smaller spatial structures and intensities.
However, the one-week forecast from the CFS does a great job
depicting TC Paloma, with a 20 percent probability centered on the formation
location.

Figure 24. Comparison of seven-day summed probabilities of TC Paloma
centered on 5 November 2008 from: (a) zero lead hindcast forced by R2
LSEFs; and (b-h) one to seven week lead hindcasts forced by CFS LSEFs.
Contours start at 1 percent and are in 1 percent increments.

Figure 24 also shows that all the non-zero lead hindcasts depict TC
Paloma within at least a 1 percent probability contour. Weeks two and three are
the worst, with 2 and 1 percent probability contours, respectively, around the
formation location. Weeks four and five depict TC Paloma within 7 and 9 percent
probability contours, respectively. Week six depicts TC Paloma within a 4 percent
probability contour. Week seven depicts TC Paloma within only a 3 percent
probability contour, but we think it is perhaps the most promising of the non-zero
lead hindcasts because (1) it captures Paloma at a long lead time, and (2) it
does so within a relatively confined and specific region east of Nicaragua.
This result indicates that forecasts based on our model and
CFS LSEFs can have sufficient skill to make them useful in operational planning
at relatively long leads (e.g., at a lead of one to two months, military planners
would be advised to try to avoid operations in the Mosquito Gulf).

Figure  25  shows  the  zero  lead  and  seven  week  lead  hindcasts  of  the  850 
mb  relative  vorticity  and  the  200  mb  divergence  LSEFs  that  correspond  to  the 
hindcasts  shown  in  Figure  24.  Note  that  the  CFS-based  seven  week  lead 
hindcasts  do  well  in  depicting  both  of  these  LSEFs  in  the  TC  Paloma  formation 
region, which leads to relatively accurate hindcast probabilities. But, as
expected, the zero lead hindcast based on R2 LSEFs depicts a greater level of
spatial structure at the two levels when compared to the seven-week lead hindcast.

Figure 25. Comparison of 850 mb relative vorticity (panels a and c, in 1/s) and 200
mb divergence (panels b and d, in 1/s) for: (a and b) zero lead hindcasts
based on R2 LSEFs; and (c and d) seven-week lead hindcast based on
CFS LSEFs. Both hindcasts valid on 5 November 2008. Corresponding
TC formation probability hindcasts shown in Figure 24.


Figure 26 compares the 850 mb relative vorticity for the one-week and
seven-week lead hindcasts of TC Paloma. Note that the one-week
lead hindcast depicts a greater level of spatial structure, which may contribute to
its greater accuracy.

Figure 26. Comparison of 850 mb relative vorticity for: (a) one-week lead
hindcast; and (b) seven-week lead hindcast, with both hindcasts based on
CFS LSEFs and valid on 5 November 2008. Corresponding TC formation
probability hindcasts shown in Figure 24.

2.  Non-Zero  Lead  Hindcasts:  Omar 

TC  Omar  was  an  interesting  case  to  study  because  the  NA  had  a  complex 
pattern  of  TC  activity.  TC  Omar  formed  on  13  October  2008,  but  TC  Nana 
formed the day before and TD 16 formed the day after. Figure 27 depicts the
seven-day summed formation probabilities for the zero to four week lead
hindcasts centered on 13 October 2008.

Figure 27. Comparison of seven-day summed probabilities of TC formation
centered on 13 October 2008 from: (a) zero lead hindcast forced by R2
LSEFs; and (b) one-week, (c) two-week, (d) three-week, and (e) four-week
lead hindcasts forced by CFS LSEFs. Contours start at 1 percent and are
in 1 percent increments. The formation locations are depicted by the
colored dots: black dot for TC Omar (formed on 13 October 2008); green
dot for TC Nana (formed on 12 October 2008); and magenta dot for TD 16
(formed on 14 October 2008).


Unlike TC Paloma, the one-week lead hindcast for TC Omar did not
perform well; in fact, all three storms were missed at this lead time. The
two-week lead hindcast represents TD 16 within a 13 percent contour, but it still
does not predict TC Omar. The best results for Omar are from the three-week
lead hindcast, with the formation location within the 4 percent contour (Figure
27, panel d). For TD 16, the best results are from the two-week lead hindcast,
but the highest probabilities are to the southeast of the formation point.

Having the highest probabilities to the southeast of the corresponding
formation point is a problem in many of our results. This problem occurs not just
in the non-zero lead hindcasts based on the CFS LSEF forecasts, but also in the
zero lead hindcasts based on the R2 data. This problem is likely due to
HURDAT recording TCs only once they reach TD strength, so that the initial
locations of the TCs in the HURDAT data (e.g., the dots shown in Figures 14, 18,
19, 22, 24, 27) are to the northwest of the actual formation locations. If so, then
the regression model, combined with the R2 and CFS LSEFs, is able to recognize the
favorable conditions that existed prior to the initial HURDAT location for the TCs.

Figure 28 shows the 850 mb relative vorticity used for the zero lead R2-based
hindcast and the one-week lead CFS-based hindcast shown in Figure 27. The
CFS does not capture the positive 850 mb relative vorticity in the TC Omar and
TD 16 vicinities. Figure 29 shows that the CFS also does a poor job of
depicting the 200 mb divergence in the TC Omar and TD 16 vicinities. These
CFS  shortcomings  lead,  in  this  case,  to  insufficient  formation  probabilities  for  TC 
Omar  and  TD  16. 

Figure 28. Comparison of 850 mb relative vorticity for: (a) zero lead hindcast
based on R2 LSEFs; and (b) one-week lead hindcast based on CFS
LSEFs, valid on 13 October 2008. Corresponding TC formation
probability hindcasts shown in Figure 27.

Figure 29. Comparison of 200 mb divergence for: (a) zero lead hindcast based on
R2 LSEFs; and (b) one-week lead hindcast based on CFS LSEFs, valid on
13 October 2008. Corresponding TC formation probability hindcasts
shown in Figure 27.

The  formation  probabilities  for  TC  Omar  at  the  two-week  lead  time  (not 
shown)  are  just  as  poor  as  the  one-week  lead  time  due  to  a  poor  depiction  by  the 
CFS  of  the  850  mb  relative  vorticity  and  200  mb  divergence.  However,  the 
corresponding  formation  probabilities  for  TD  16  (not  shown)  do  a  better  job  than 
the  one-week  lead.  At  the  three  and  four-week  lead  times,  the  CFS  does  well  at 
depicting  both  850  mb  relative  vorticity  and  200  mb  divergence,  leading  to  good 
representation  of  TC  Omar  and  TD  16  in  the  corresponding  formation 
probabilities. 

Figure 30 shows the three-week lead hindcast formation probability anomaly
for TC Omar (see Section 3 of Chapter II for more on these anomalies).
This  figure  shows  that  TC  Omar  and  TD  16  occurred  within  areas  that  the 
hindcast  identified  as  having  above  average  formation  probabilities.  Thus,  the 
three-week  lead  hindcast  represents  an  improvement  over  the  use  of 
climatological  probabilities.  The  similarities  between  the  formation  probability 
anomalies  shown  in  Figure  30  and  the  corresponding  formation  probabilities 
shown  in  Figure  27,  panel  d,  indicate  that  the  formation  probabilities  in  Figure  27, 
even  the  probabilities  of  just  a  few  percent,  represent  probabilities  that  exceed 
the  average  probabilities  for  this  seven-day  period.  This  is  generally  true  of  the 
hindcast  and  forecast  probabilities. 

Figure 30. Three-week lead seven-day summed TC formation probability
hindcast anomaly centered on 13 October 2008. The black dot represents
TC Omar that formed on 13 October 2008, the green dot represents TC
Nana that formed on 12 October 2008, and the magenta dot represents
TD 16 that formed on 14 October 2008. Compare this figure to the
corresponding formation probability hindcast in Figure 27, panel d.

3. CFS Two-week Forecast Comparison

In  an  operational  sense,  our  model  needs  to  be  consistent  from  day  to 
day.  If  a  two-week  CFS  forecast  initialized  on  14  May  (forecast  probabilities 
centered  on  29  May)  accurately  predicts  high  probabilities  near  Cuba,  then  a 
two-week  CFS  forecast  initialized  on  15  May  (forecast  probabilities  centered  on 
30  May)  should  also  predict  high  probabilities  in  that  region.  A  lack  of  such 
consistency  is  an  indication  of  not  only  scientific  problems  with  the  forecasting 
system  but  also  of  potential  problems  in  using  the  forecasts  in  planning 
operations. Inconsistent forecasts indicate forecast uncertainty and make it
difficult to confidently develop plans based on the forecasts.

To  check  for  day-to-day  consistency  in  our  forecasts,  we  created  seven 
two-week  forecasts  initialized  a  day  apart  from  one  another;  the  first  initialized  on 
26  September  2008  (forecast  probabilities  centered  on  10  October  2008),  and 
the  last  initialized  on  2  October  2008  (forecast  probabilities  centered  on  16 
October  2008).  During  this  forecast  period,  TC  Nana  formed  on  12  October,  TC 
Omar formed on 13 October, and TD 16 formed on 14 October. Figure 31 shows
the results from these two-week forecasts. Note in this figure the overall
consistency from day to day in these seven two-week lead hindcasts. This
consistency  helps  validate  our  forecast  system  and  its  use,  and  indicates  that  the 
forecasts  from  our  system  may  be  consistent  enough  to  be  useful  to  military 
planners. 
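
The consistency described here was judged visually. One possible way to put a number on it, offered only as a sketch and not as a method used in this study, is the pattern correlation between probability maps from forecasts initialized on consecutive days. The Python example below assumes each forecast is a two-dimensional array on the same grid and applies no latitude weighting; the function names are hypothetical.

```python
import numpy as np


def pattern_correlation(field_a, field_b):
    """Centered, unweighted spatial correlation between two maps on the same grid."""
    a = np.asarray(field_a, dtype=float).ravel()
    b = np.asarray(field_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A spatially constant map has no pattern to correlate.
    return float((a * b).sum() / denom) if denom > 0 else float("nan")


def consecutive_consistency(maps):
    """Pattern correlation for each pair of consecutive forecast maps."""
    return [pattern_correlation(maps[i], maps[i + 1]) for i in range(len(maps) - 1)]
```

Values near one for every consecutive pair would indicate that the location of the high-probability regions changes little from one initialization day to the next.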

Figure 31. Seven-day summed TC formation probabilities from seven two-week
lead hindcasts, valid on: (a) 10 October 2008; (b) 11 October 2008; (c) 12
October 2008; (d) 13 October 2008; (e) 14 October 2008; (f) 15 October
2008; and (g) 16 October 2008. TC Nana (formed on 12 October 2008) is
depicted by the green dot, TC Omar (formed on 13 October 2008) is
depicted by the black dot, and TD 16 (formed on 14 October 2008) is
depicted by the magenta dot. Contours start at 1 percent and are in 1
percent increments. Note the general consistency in the probabilities for
forecasts valid on consecutive days.

4. Forecasts for June 2009


In addition to hindcasts, we also generated experimental forecasts for the
2009 NA TC season. Figure 32 below depicts some of these forecasts of TC
formation probabilities, all of which were initialized on 20 May 2009. The valid
periods are centered on 3, 10, 17, and 24 June 2009 with lead times of two,
three, four, and five weeks, respectively. To the right of each forecast is the
corresponding anomaly forecast (see Section 3 of Chapter II). Note that the
forecasted probabilities in Figure 32 are not extremely high. This is consistent
with NA TC activity in June generally being low to modest. Note also a general
increase in probabilities from the first week to the last week, also consistent with
historical TC activity patterns (e.g., Figure 5).

Since  the  valid  periods  for  the  forecasts  shown  in  Figure  32  will  occur  after 
the  writing  of  this  report,  the  verification  of  these  forecasts  is  beyond  the  scope  of 
this study. However, readers may verify these forecasts using TC data at
http://www.nhc.noaa.gov/pastall.shtml and OLR data at
http://www.cdc.noaa.gov.  The  verification  of  these  forecasts  will  be  useful  in 
assessing  the  potential  value  of  such  forecasts  to  military  planners. 

Figure 32. Seven-day summed TC formation probability forecasts (left column)
and  TC  formation  probability  forecast  anomalies  (right  column)  from  four 
forecasts  initialized  on  20  May  2009  and  valid  on:  (a)  03  June  2009;  (b)  10 
June  2009;  (c)  17  June  2009;  (d)  24  June  2009.  Contours  start  at  1 
percent  and  are  in  1  percent  increments. 

5. Forecasts for July 2009

Figure 33 shows forecasts of TC formation probabilities, initialized on 20
May 2009 and valid for the seven-day periods centered on each Wednesday of
July 2009 (i.e., 1, 8, 15, 22, and 29 July 2009). To the right of each forecast is
the corresponding anomaly forecast (see Section 3 of Chapter II). Note that
the forecasted probabilities in Figure 33 are not extremely high. This is
consistent with NA TC activity in July generally being low to modest. Note also
a general increase in probabilities from the first week to the last week, also
consistent with historical TC activity patterns (e.g., Figure 5).

The  LSEFs  in  July,  compared  to  June,  should  be  more  favorable  for  TC 
formation;  therefore,  we  should  see  higher  probabilities  in  July.  In  fact,  the 
forecasted  July  2009  probabilities  are  higher  than  those  for  June  2009  and 
indicate  that  more  of  the  NA  is  favorable  for  TC  development.  The  increase  in 
favorable  areas  is  especially  noticeable  in  the  main  development  region  for  NA 
TCs,  the  tropical  Atlantic  between  northern  South  America  and  western  North 
Africa (see Section B.1 of Chapter II). This is consistent with historical TC
activity: during 1970-2007, July experienced almost twice as many TC
formations as June (see Figure 5). As discussed in the prior section, we leave it
to the reader to verify these forecasts.

Figure 33. Seven-day summed TC formation probability forecasts (left column)
and  TC  formation  probability  forecast  anomalies  (right  column)  from  five 
forecasts  initialized  on  20  May  2009  and  valid  on:  (a)  01  July  2009;  (b)  8 
July  2009;  (c)  15  July  2009;  (d)  22  July  2009;  and  (e)  29  July  2009. 
Contours  start  at  1  percent  and  are  in  1  percent  increments. 




6. General Observations


We have shown that, using the R2 data, we can skillfully hindcast at zero
lead the formation times and locations of individual TCs.
To hindcast or forecast at non-zero lead times, we need accurate
long range forecasts of the LSEFs to force our regression model. The CFS
appears to have potential for producing these LSEF forecasts. Our results
indicate that the skill of the CFS forecasts may be somewhat more consistent at
lead times longer than three weeks than at shorter lead times. We suspect that
this difference in consistency may be due to a greater tendency at shorter leads
for the CFS to depict the small-scale circulations associated with individual TCs,
which then lead to high formation probabilities from the regression model. If so,
and if these circulations are inconsistently forecasted by the CFS (as seems
likely), then the shorter lead formation probability forecasts will also be
inconsistent. At longer lead times, the bias corrected CFS forecasts appear to
smooth out the smaller scale circulations and to produce more
consistency in forecasting the LSEFs.

A  recurring  problem  in  our  model  is  the  tendency  to  forecast  high  TC 
formation  probabilities  north  of  Panama.  As  seen  in  Figure  4,  that  region  has 
only  produced  a  handful  of  TCs  during  1970-2007.  Thus,  our  model  has  a 
tendency  to  over-predict  TC  formation  probabilities  in  this  region.  We  tried  a  few 
different  approaches  to  correcting  this  problem  that  reduced  the  problem  but  did 
not  eliminate  it.  It  may  be  that  a  bias  correction  needs  to  be  applied  to  the 
model’s  output  for  this  region.  It  is  useful  to  note,  however,  that  high  probabilities 
in  this  region  are  consistent  with  a  tendency  for  deep  convection  in  this  region 
during  June-November  (see  Figure  34).  Our  model  highlights  this  convectively 
active  region  and  produces  high  TC  formation  probabilities  as  a  result. 

Figure 34. Long term mean OLR for July-November 1982-2008 (PSD 2008).

Another  problem  with  the  model  is  a  timing  issue,  which  is  probably  due  to 
HURDAT.  The  highest  probabilities  associated  with  a  given  TC  tend  to  occur 
prior  to  and  to  the  southeast  of  the  initial  HURDAT  time  and  location.  As 
mentioned  in  Chapter  II,  HURDAT  only  has  TC  data  from  the  time  and  location  at 
which  a  TC  reached  TD  intensity.  Thus,  our  model  probably  tends  to  identify 
high  probabilities  prior  to  the  initial  HURDAT  date  for  a  TC  (i.e.,  prior  to  the  TC 
reaching  TD  intensity). 

As with any long lead forecast, the skill of the forecast tends to decrease
with increasing lead time. As mentioned earlier, we addressed this problem by
forecasting for seven-day valid periods using seven-day summed probabilities.

IV. SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS


A.  KEY  RESULTS  AND  CONCLUSIONS 

In  this  thesis,  we  created  and  tested  a  combined  statistical-dynamical 
model  to  predict  the  probability  of  TC  formation  at  daily,  2.5°  horizontal  resolution 
in  the  NA  at  intraseasonal  lead  times.  We  trained  the  model  using  R2  and  NOAA 
OISST  data,  with  the  results  summarized  in  Table  1.  We  then  forced  the  model 
with  operational  CFS  forecasts  of  the  LSEFs.  Using  the  CFS  forecasts  to  force 
our  regression  model,  we  have  the  potential  to  predict  TC  formations  at  lead 
times  of  up  to  several  seasons.  However,  in  this  thesis  our  longest  lead  time  for 
a  verified  hindcast  or  forecast  was  seven  weeks.  In  this  seven-week  forecast, 
our  model  depicted  TC  Paloma  within  the  3  percent  contour  and  above  the 
climatological  probability. 

During  verification  of  our  model,  we  tested  the  predictive  potential  of  our 
model  using  quantitative  and  qualitative  verification  of  R2  and  NOAA  OISST 
based,  zero  lead  hindcasts.  Our  model  showed  great  potential  with  a  BSS  of 
0.0396,  a  ROCSS  of  0.720,  and  a  hit  rate  of  94.5  percent.  We  also  found  that 
our  model  verified  well  against  tropical  deep  convection  (as  indicated  by  OLR). 
This  is  because  the  model’s  TC  formation  probabilities  are  closely  related  to  the 
probability  of  deep  convection  in  general.  This  deep  convection  sometimes  leads 
to TC formation, and at other times leads to storms with winds below the TD
threshold.  From  a  practical  standpoint,  for  the  military  planner,  areas  of  elevated 
tropical  cyclone  formation  probabilities  are  regions  that  are  good  to  avoid 
whether  the  result  is  TC  formation,  or  merely  deep  convection. 

We  found  that  the  model  for  the  NA  is  very  similar  to  that  for  the  WNP 
(Mundhenk  2009).  This  supports  the  idea  that  TCs  have  similar  relationships  to 
LSEFs in all the tropical basins in which they form. It also indicates that
eventually  one  version  of  our  model  may  apply  to  all  these  basins. 

We also used our model to produce non-zero lead hindcasts based on
CFS predictions of the LSEFs. The skill at lead times greater than two weeks
was encouraging. However, CFS has difficulty in forecasting the small and
intense  low  level  circulations  that  tend  to  occur  during  TC  formations.  This  CFS 
shortcoming  tended  to  reduce  the  accuracy  of  these  hindcasts.  The  most 
significant  weakness  of  our  forecasting  system  is  the  inability  of  CFS  to  properly 
forecast  small  and/or  intense  features  in  the  LSEFs. 

B.  APPLICABILITY  TO  DOD  OPERATIONS 

There  is  a  gap  between  existing  climate  and  long  range  forecasting 
capabilities  and  the  products  presently  being  used  to  support  DoD  customers.  It 
is  unclear  whether  this  gap  exists  because  DoD  support  providers  and  customers 
do  not  realize  that  advanced  forecast  products  exist  and  are  being  operationally 
used  by  the  civilian  climate  centers,  and/or  because  of  shortcomings  in  DoD 
resources.  Unlike  the  DoD,  the  civilian  sector  does  not  have  this  gap  between 
capability  and  products. 

Perhaps  one  way  to  close  the  DoD  gap  is  to  better  inform  DoD  customers 
of  the  capabilities  and  identify  their  requirements  for  the  products  that  can  be 
generated  by  these  capabilities.  In  a  discussion  of  operational  climate  prediction 
at CPC, O’Lenic et al. (2007) state that “improvements in the science and
production  methods  of  LRFs  [long-range  forecasts]  are  increasingly  being  driven 
by  users,  who  are  finding  an  increasing  number  of  applications,  and  demanding 
improved  access  to  forecast  information.” 

The  challenge  of  closing  this  gap  lies  in  the  hands  of  the  DoD  meteorology 
and  oceanography  community.  The  majority  of  DoD  planning  occurs  at  two- 
week  to  intraseasonal  lead  times,  a  timeframe  in  which  planners  are  not 
accustomed  to  seeing  advanced  climate  and  long  range  forecasting  products. 
We  need  to  show  the  mission  planners  the  various  products  we  can  provide  at 
long-range  lead  times  that  consistently  beat  standard  climatology.  In  this  thesis, 
we  have  identified  several  advanced  products  for  planning  operations  in  TC- 
prone  regions,  including  improved  climatologies  of  TC  probabilities  and  long- 
range  forecasts  of  TC  formations  in  the  NA.  Mundhenk  (2009)  provided  similar 
evidence  for  improved  products  for  long  range  planning  in  the  WNP. 

C.  RECOMMENDATIONS  FOR  FUTURE  STUDY 

Though  this  is  the  third  Naval  Postgraduate  School  thesis  on  statistical- 
dynamical  long-range  forecasts  of  TC  formations,  there  are  plenty  of  areas  that 
still  need  to  be  investigated  further.  Areas  for  future  study  include,  but  are  not 
limited  to,  the  following  topics  and  questions: 

1)  Is  there  a  better  NA  TC  data  set  than  HURDAT?  A  better  data  set 
could  alleviate  some  of  the  timing  and  placement  issues  with  our 
forecast.  If  there  is  no  better  data  set,  then  perhaps  the  DoD  or 
NOAA  should  invest  in  one. 

2)  In  concert  with  1),  climatology  for  TC  formation  is  very  poorly 
defined.  A  research  effort  to  more  completely  and  accurately 
define  TC  climatology  would  be  very  beneficial  not  just  to  military 
planners,  but  to  many  other  government  and  business  planners  as 
well. 

3)  We  have  seen  the  propensity  of  our  model  to  chase  TCs  that  have 
already  formed  and  thus  produce  high  probabilities  after  TC 
formation.  In  addition  to  the  RV2  term,  is  there  another  variable  or 
filter  that  we  can  use  to  eliminate  these  high  post-formation 
probabilities? 

4)  More  investigation  should  be  done  concerning  the  amount  of  NTCI 
used  to  train  the  regression  model.  Though  we  tested  40,  50,  60, 
80,  and  100  percent  NTCI,  more  tests  need  to  be  conducted  to 
determine  the  optimal  amount  of  NTCI. 

5)  Can  a  multiday  or  multi-model  ensemble  technique  be  used  to 
produce  better  non-zero  lead  forecasts? 

6)  The  use  of  a  different  long-range  forecasting  model  for  predicting 
the  LSEFs  might  lead  to  more  LSEFs  for  consideration  in  the 
training  of  the  regression  model.  This  would  allow  more  choice  in 
building  the  best  regression  model,  and  would  reduce  the  potential 
for  errors  that  are  introduced  when  LSEFs  needed  by  the 
regression  model  have  to  be  calculated  from  other  LSEFs  rather 
than  derived  by  the  long-range  forecast  model.  Further  research 
should  compare  long  lead  forecasts  from  different  models  (e.g., 
models  from  ECMWF,  advanced  CFS,  GFS,  Goddard  Space  Flight 
Center,  or  the  Australian  Bureau  of  Meteorology). 
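
As an illustration of question 5), a multiday (lagged-average) ensemble could be formed by averaging the probability fields from forecasts that were initialized on consecutive days but are valid for the same day, and then scoring the result against climatology with a Brier skill score. The following is a minimal sketch under stated assumptions, not the method used in this thesis: the function names, the array shapes, the equal weighting of runs, and the small hand-made arrays are all hypothetical and serve only to show the mechanics.

```python
import numpy as np


def brier_score(prob, obs):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))


def brier_skill_score(prob, obs, ref_prob):
    """Skill relative to a reference forecast (here, climatology).

    Positive values mean the forecast beats the reference.
    """
    bs_ref = brier_score(ref_prob, obs)
    if bs_ref == 0.0:
        raise ValueError("Reference forecast is perfect; skill score is undefined.")
    return 1.0 - brier_score(prob, obs) / bs_ref


def lagged_ensemble(runs):
    """Equal-weight average of probability fields valid for the same day.

    runs: array of shape (n_runs, n_lat, n_lon), one field per initialization
    day (hypothetical layout).
    """
    return np.mean(np.asarray(runs, dtype=float), axis=0)


# Hypothetical 2 x 2 grid: 1 marks a grid box with a TC formation, 0 otherwise.
obs = np.array([[0, 1],
                [0, 0]])

# Three hypothetical forecasts from consecutive initialization days,
# all valid for the same day.
runs = np.array([
    [[0.1, 0.6], [0.2, 0.0]],
    [[0.3, 0.8], [0.0, 0.2]],
    [[0.2, 0.7], [0.1, 0.1]],
])

# Hypothetical flat climatological probability for every grid box.
climo = np.full(obs.shape, 0.25)

ens = lagged_ensemble(runs)
bss = brier_skill_score(ens, obs, climo)
print("ensemble probabilities:\n", ens)
print("Brier skill score vs. climatology:", round(bss, 3))
```

Equal weighting is the simplest choice. Weighting more recent initializations more heavily, or combining runs from different models, would be natural variations to test.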